When Kubernetes API Fails, Traefik starts returning 404s #1240
Comments
@regner are you possibly working on this one? Just for the book-keeping. :-)
Yeah, the fix is easy, but updating the tests to cover it is the more difficult part. I am busy during the week here but would love to take a shot at fixing it this coming weekend if there isn't a rush on it.
@regner no rush from my end, take your time. I suppose you saw my response on Slack to your testing question. Feel free to reach out again if I can assist somehow. 😃
Will do! Thanks. :)
Currently if a Kubernetes API call fails we potentially remove a working service from Traefik. This changes it so if a Kubernetes API call fails we abort out of the ingress update and use the current working config. Github issue: traefik#1240 Also added a test to cover when requested resources (services and endpoints) that the user has specified don’t exist.
* Abort Kubernetes Ingress update if Kubernetes API call fails
  Currently, if a Kubernetes API call fails we potentially remove a working service from Traefik. This changes it so that if a Kubernetes API call fails we abort the ingress update and keep using the current working config.
  GitHub issue: #1240
  Also added a test to cover the case where requested resources (services and endpoints) that the user has specified don't exist.
* Specifically capturing the tc range variable as documented here: https://blog.golang.org/subtests
* Updating service names in the mock data to be clearer
* Updated expected data to match what currently happens in the loadIngress
* Adding a blank Servers to the expected output so we compare against that instead of nil.
* Replacing the JSON test output with spew for the TestMissingResources test to help ensure we have useful output in case of failures
* Adding a temporary fix to the mocked GetEndpoints function so we can override the return value that says whether the endpoints exist. After the 1.2 release the use of properExists should be removed and the GetEndpoints function should return false for the second value, indicating the endpoint doesn't exist. However, at this time that would break a lot of the tests.
* Adding quick TODO line about removing the properExists property
* Link to issue 1307 re: properExists flag.
Should be fixed by #1295 (which went into the 1.2 branch).
This looks like a duplicate of #912?
What version of Traefik are you using (traefik version)?
1.1.3
What is your environment & configuration (arguments, toml...)?
Kubernetes ingress controllers
What did you do?
When the Kubernetes API returns an error during sync, Traefik starts returning 404s.
What did you expect to see?
Traefik continues to run with last known configuration.
What did you see instead?
Traefik wipes config and starts returning 404s until API calls start succeeding.
Traefik maintains a cached copy of the last good configuration. It should continue to use that configuration when it encounters an API error during the sync process. Right now it only logs the error and discards it, then behaves as if the sync succeeded and replaces the last known good config with an incomplete one.
I believe if it were to return errors here (and other places where it is currently throwing away the errors):
https://github.com/containous/traefik/blob/master/provider/kubernetes.go#L194
Then it wouldn't override the config here, and the last good state would be maintained instead of being cleared out:
https://github.com/containous/traefik/blob/master/provider/kubernetes.go#L88
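To make the proposal concrete, here is a minimal Go sketch of the idea. It is not Traefik's actual code: `Config`, `Client`, `Provider`, `loadConfig`, `Sync`, and `fakeClient` are hypothetical names invented for illustration, and the three-value `GetEndpoints` signature is only assumed to resemble the real client. The point it shows is that an API error is returned to the caller rather than logged and dropped, so the last good configuration is only replaced after a fully successful load. A resource that is genuinely missing (no error, `exists == false`) is treated differently from a failed API call.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical stand-in for the routing configuration.
type Config struct {
	Backends map[string][]string // backend name -> server URLs
}

// Client is a hypothetical, minimal view of a Kubernetes API client.
// exists == false with a nil error means the resource is genuinely absent;
// a non-nil error means the API call itself failed.
type Client interface {
	GetEndpoints(namespace, name string) (servers []string, exists bool, err error)
}

// loadConfig builds a fresh Config. Any API error aborts the whole build
// instead of producing a partial configuration.
func loadConfig(c Client, services []string) (*Config, error) {
	cfg := &Config{Backends: map[string][]string{}}
	for _, name := range services {
		servers, exists, err := c.GetEndpoints("default", name)
		if err != nil {
			return nil, fmt.Errorf("loading endpoints for %s: %w", name, err)
		}
		if !exists {
			// Assumption: a missing resource yields a backend with no servers.
			cfg.Backends[name] = []string{}
			continue
		}
		cfg.Backends[name] = servers
	}
	return cfg, nil
}

// Provider remembers the last configuration that was built successfully.
type Provider struct {
	lastGood *Config
}

// Sync replaces the last good configuration only when loading succeeds.
func (p *Provider) Sync(c Client, services []string) error {
	cfg, err := loadConfig(c, services)
	if err != nil {
		return err // lastGood is left untouched
	}
	p.lastGood = cfg
	return nil
}

// fakeClient simulates a healthy or a failing API for the demonstration.
type fakeClient struct {
	endpoints map[string][]string
	fail      bool
}

func (f fakeClient) GetEndpoints(namespace, name string) ([]string, bool, error) {
	if f.fail {
		return nil, false, errors.New("kubernetes API unavailable")
	}
	servers, ok := f.endpoints[name]
	return servers, ok, nil
}

func main() {
	p := &Provider{}
	services := []string{"web"}

	healthy := fakeClient{endpoints: map[string][]string{"web": {"http://10.0.0.1:80"}}}
	if err := p.Sync(healthy, services); err != nil {
		fmt.Println("unexpected error:", err)
	}

	// The API now fails: Sync reports the error and keeps the old config.
	err := p.Sync(fakeClient{fail: true}, services)
	fmt.Println("sync error:", err)
	fmt.Println("servers still configured:", p.lastGood.Backends["web"])
}
```

Returning the error from the load step, rather than deciding inside it what to do, leaves the caller free to log it, retry, or simply wait for the next sync event while continuing to serve the previous configuration.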