
split liveness and readiness probes #247

Merged
1 commit merged into mittwald:master on Mar 29, 2023

Conversation

slimm609 (Contributor)

Split out liveness and readiness probes. The liveness probe will always respond while the application is running, but the readiness probe will not respond while a large sync is in progress.

fixes #215
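
(For illustration, a minimal sketch of how the split looks in the container spec; the values are illustrative, with the port and paths taken from the /healthz and /readyz endpoints discussed later in this thread, and the exact fields in deploy/deployment.yaml may differ.)

livenessProbe:
  httpGet:
    path: /healthz     # always answers while the process is running
    port: 9102
readinessProbe:
  httpGet:
    path: /readyz      # withheld while a large sync is in progress
    port: 9102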

martin-helmich previously approved these changes Jan 4, 2023
martin-helmich enabled auto-merge (squash) January 4, 2023 08:21
auto-merge was automatically disabled March 2, 2023 01:16

Head branch was pushed to by a user without write access

slimm609 force-pushed the split_liveness_readiness branch from b34feae to 9c61e7d on March 2, 2023 01:16
slimm609 (Contributor, Author) commented Mar 2, 2023

Rebased with the 1.18 update.

slimm609 (Contributor, Author) commented

@martin-helmich this should be good now

martin-helmich (Member) left a comment


LGTM 👍

martin-helmich merged commit 796a76d into mittwald:master Mar 29, 2023
cherrera-acx commented

@martin-helmich - Will this be added to a future release, or is this change already available in v2.7.3?

timhavens commented Mar 31, 2023

:(

I don't know if this is related, but I think it is, because the issue coincides with this change.

Most likely it's because I'm using the 'latest' image while the YAML in the master branch now references an endpoint that the 'latest' image doesn't host (yet) - my mistake, I suppose. Still, I was following the 'Manual' install in your README.md when this occurred, so that may need updating if the latest image is still 2.7.3. It's a simple tweak to get it working like it was (see the sketch below), but it took me a while to realize what was happening.
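
(For reference, a sketch of the kind of tweak meant here, assuming the master deployment.yaml now points the readiness probe at /readyz: until a released image serves that endpoint, the readinessProbe can point back at the endpoint the running image does expose.)

readinessProbe:
  httpGet:
    path: /healthz     # the v2.7.3 / 'latest' image serves /healthz; /readyz returns 404 until a newer release
    port: 9102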

I've been using a manual install process for quite a while, and about 2 days ago I noticed Replicator's Ready state was 0/1; since then I've been trying to debug it. Replicator still appears to be replicating things successfully, but the readiness state never reaches 1/1 anymore.

$ # Create roles and service accounts
$ kubectl apply -f https://raw.githubusercontent.com/mittwald/kubernetes-replicator/master/deploy/rbac.yaml
$ # Create actual deployment
$ kubectl apply -f https://raw.githubusercontent.com/mittwald/kubernetes-replicator/master/deploy/deployment.yaml

I run this on an AWS EKS cluster. The pod has an event reporting a 404 response from the readiness probe. I confirmed that by port-forwarding to the pod and hitting the endpoints directly:

http://localhost:9102/readyz
404 page not found
http://localhost:9102/healthz
{
"notReady": []
}

I also noticed in the EKS logs this entry:

{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Request", "auditID": "xxxx", "stage": "ResponseComplete", "requestURI": "/apis/rbac.authorization.k8s.io/v1/clusterroles/replicator-kubernetes-replicator", "verb": "get", "user": { "username": "kubernetes-admin", "uid": "aws-iam-authenticator:xxxx:xxxx", "groups": [ "system:masters", "system:authenticated" ], "extra": { "accessKeyId": [ "xxxx" ], "arn": [ "arn:aws:iam::xxxx:user/xxxx" ], "canonicalArn": [ "arn:aws:iam::xxxx:user/xxxx" ], "sessionName": [ "" ] } }, "sourceIPs": [ "x.x.x.x" ], "userAgent": "kubectl/v1.24.3 (linux/amd64) kubernetes/aef86a9", "objectRef": { "resource": "clusterroles", "name": "replicator-kubernetes-replicator", "apiGroup": "rbac.authorization.k8s.io", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "status": "Failure", "message": "clusterroles.rbac.authorization.k8s.io \"replicator-kubernetes-replicator\" not found", "reason": "NotFound", "details": { "name": "replicator-kubernetes-replicator", "group": "rbac.authorization.k8s.io", "kind": "clusterroles" }, "code": 404 }, "requestReceivedTimestamp": "2023-03-30T18:47:57.369815Z", "stageTimestamp": "2023-03-30T18:47:57.374817Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "" } }

Successfully merging this pull request may close these issues.

kubernetes-replicator pod crashes when updating secrets (#215)