LeaderElection within a GKE cluster: Unauthorized error

I run a very basic app in a GKE cluster. The app uses the `leaderelection.RunOrDie` function. I noticed strange behavior: sometimes, when I perform a rolling update, the logs from the pod that is being terminated contain errors like these:

error retrieving resource lock demo/app: Unauthorized
Failed to release lock: Unauthorized

When a new pod is spawned, everything is fine and the election works as expected.
The errors do not appear on every rollout, and I can't find a reliable way to reproduce them. I use the following settings for the election:

ReleaseOnCancel: true,
LeaseDuration: 60 * time.Second, //nolint: gomnd
RenewDeadline: 20 * time.Second, //nolint: gomnd
RetryPeriod: 10 * time.Second, //nolint: gomnd

Once I receive the SIGTERM signal from Kubernetes, I immediately cancel the context used for the election. Afterwards, I wait 30 seconds and restart my pod.

Do you have any idea what may be wrong? It is worth adding that I run the same app on another cloud provider and have never seen this error there.

ACCEPTED SOLUTION

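I owe you an explanation. The root cause of my issue was my Helm deployment, which was recreating the service account. When I deployed the app, a new service account was created while the old pod was still running, so the old pod was presumably still using credentials tied to the deleted service account, which would explain the Unauthorized errors during termination. I fixed my Helm chart, and the problem was solved.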


REPLIES

Does your pod use a persistent volume?

No, I don't use persistent volumes in that app at all.
