Ingress controller dropping websocket connections when performing backend reload #2461

Closed
AssafShaikevich opened this issue May 2, 2018 · 9 comments

Comments

@AssafShaikevich

NGINX Ingress controller version: 0.12.0 and 0.14.0

Kubernetes version (use kubectl version): v1.8.0

Environment: AWS with ELB

  • Cloud provider or hardware configuration: AWS m4 medium
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.1 LTS
  • Kernel (e.g. uname -a): 4.4.0-119-generic x86_64 GNU/Linux
  • Install tools: kops
  • Others:

What happened: The ingress controller drops websocket connections when performing a backend reload.

What you expected to happen: Websocket connections should stay connected to the target server.

How to reproduce it (as minimally and precisely as possible):
1. Deploy a simple websocket service into Kubernetes and create an ingress rule for it (a minimal manifest sketch follows below).
2. Open a websocket client and connect (for example, https://websocket.org/echo.html).
3. Cause the ingress controller to reload its backend configuration (for example, add or delete an ingress rule).
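
A minimal sketch of such a setup, assuming a hypothetical `ws-echo` backend listening on port 8080 (all names, the host, and the echo behavior are illustrative, not from the report):

```yaml
# Service + Ingress for a hypothetical websocket echo backend.
apiVersion: v1
kind: Service
metadata:
  name: ws-echo
spec:
  selector:
    app: ws-echo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: extensions/v1beta1   # the Ingress API group current at the time
kind: Ingress
metadata:
  name: ws-echo
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: ws.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: ws-echo
              servicePort: 80
```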

Anything else we need to know:
I verified this with versions 0.12 and 0.14.

@ElvinEfendi
Member

To my knowledge this is how Nginx behaves on reloads. That being said, you can enable dynamic mode with the `--enable-dynamic-configuration` flag to avoid reloads on backend changes.
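
For reference, a sketch of where that flag goes in the controller Deployment (the container name and image tag are illustrative; the flag is the one named above):

```yaml
# Deployment fragment: enabling dynamic backend configuration (0.14.x-era flag).
spec:
  containers:
    - name: nginx-ingress-controller
      image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.14.0
      args:
        - /nginx-ingress-controller
        - --enable-dynamic-configuration
```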

@JordanP
Contributor

JordanP commented May 2, 2018

Or use the `worker-shutdown-timeout` setting in your controller ConfigMap.
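
For example, a sketch of that ConfigMap entry (the ConfigMap name, namespace, and timeout value depend on your installation):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration   # must match the controller's --configmap flag
  namespace: ingress-nginx
data:
  # Keep old nginx workers alive for up to 5 minutes after a reload so
  # established connections (including websockets) can finish gracefully.
  worker-shutdown-timeout: "300s"
```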

@AssafShaikevich
Author

Thanks for the reply @JordanP and @ElvinEfendi.
With `--enable-dynamic-configuration` the disconnects happen less often, but they still happen.
Our understanding is that websockets should be persistent connections to the backend.
Are we wrong to use nginx to proxy these connections?
Can we do something like `nginx -s reload` to keep the connections alive?

@aledbf
Member

aledbf commented May 3, 2018

With `--enable-dynamic-configuration` the disconnects happen less often, but they still happen.

OK, can you tell us what you are changing? Maybe you are deploying a new version of your app?

Our understanding is that websockets should be persistent connections to the backend. Are we wrong to use nginx to proxy these connections?

This is not an issue with nginx in particular; you will face the same issue with other load balancers. The issue here is that you need to drain the connections before replacing the old pods (once Kubernetes removes the pod from the ready state, we cannot do anything about it).

Please check #322 (comment)

Also, please adjust the value of `worker-shutdown-timeout`.
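
One common way to drain, sketched below with a `preStop` sleep and a long termination grace period on the backend pod (the container name and durations are illustrative, not a recommendation from this thread):

```yaml
# Backend pod spec fragment: delay SIGTERM so in-flight websocket
# connections can close before the pod is killed.
spec:
  terminationGracePeriodSeconds: 120
  containers:
    - name: websocket-server
      lifecycle:
        preStop:
          exec:
            # kubelet runs this before sending SIGTERM; meanwhile the pod is
            # being removed from endpoints, so no new connections arrive.
            command: ["sh", "-c", "sleep 60"]
```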

@aledbf
Member

aledbf commented May 3, 2018

Can we do something like `nginx -s reload` to keep the connections alive?

That is what happens now.

@AssafShaikevich
Author

@aledbf actually, because the ingress controller is a cluster-wide service (pods from many different namespaces are reverse-proxied through it), each change event it picks up triggers a reload and therefore disconnects our websockets.
For example, we have our websocket pod working with clients connected to it; then we deploy a new HTTP service pod with a new ingress rule in a different namespace, which causes the nginx controller to reload and close the websocket pod's connections.
So it is rather unwanted behavior.

@JordanP
Contributor

JordanP commented May 3, 2018

You may want to read http://danielfm.me/posts/painless-nginx-ingress.html, especially the "Ingress Classes To The Rescue" section.
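
The idea from that section, sketched with a hypothetical class name: run a dedicated controller instance started with `--ingress-class=nginx-websocket` and point only the websocket ingresses at it, so unrelated ingress changes in other namespaces no longer reload the controller carrying the websocket traffic.

```yaml
# Ingress metadata fragment: pin this rule to the dedicated controller
# (the class name "nginx-websocket" is illustrative).
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx-websocket"
```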

@wincus

wincus commented May 8, 2018

For future reference: setting `worker-shutdown-timeout` long enough (24 hours) fixed the issue for me on version /~https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0
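
In ConfigMap terms that would be something like this sketch (value taken from the comment above):

```yaml
data:
  worker-shutdown-timeout: "86400s"   # 24 hours, as described above
```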

@ElvinEfendi
Member

ElvinEfendi commented May 8, 2018

Just want to note that `worker-shutdown-timeout` might have an unexpected side effect: every time Nginx receives a reload signal, it spins up a new full set of workers without shutting down the current ones until the timeout (24 hours here) passes or they no longer have any active connections. And if the controller reloads Nginx often enough, your pod can get OOM-killed.

@aledbf aledbf closed this as completed Jun 22, 2018
kerin pushed a commit to ministryofjustice/analytics-platform-config that referenced this issue Aug 29, 2018
Increases `client_body_buffer_size` and `worker_shutdown_timeout` to handle websocket disconnection and reconnection more gracefully.

See: kubernetes/ingress-nginx#2461
kerin pushed a commit to ministryofjustice/analytics-platform-config that referenced this issue Aug 29, 2018
* Ingress-Nginx: Upgrade

[Trello](https://trello.com/c/M1snktNZ)

Relates to
[PR](/~https://github.com/ministryofjustice/analytics-platform-helm-charts/pull/197)

Increased ssl-session and proxy-read timeouts to prevent `nginx` from
closing websocket connections to frontend tools.

Added a default SSL cert argument. We are beginning to use [cert-manager](/~https://github.com/jetstack/cert-manager),
which stores the resulting TLS certs in secrets. K8s secrets are
namespace-scoped, so without a default TLS cert provided to the nginx
ingress, we would need to ensure that the secret existed in every
namespace where an ingress object needed to use it.

This also introduces a weird explicit dependency: we have to ensure the
certificate exists before we deploy nginx-ingress, and then we have to
ensure the k8s secret's name matches what we have provided in the config.

So in theory the nginx-ingress chart has a dependency on the
cert-manager-resources chart.

* Update nginx-ingress config for alpha and dev

Increases `client_body_buffer_size` and `worker_shutdown_timeout` to handle websocket disconnection and reconnection more gracefully.

See: kubernetes/ingress-nginx#2461

* Run 3 nginx-ingress replicas in alpha env
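
A sketch of the kind of ConfigMap values the commit describes (the values here are illustrative; the keys are standard ingress-nginx options matching the settings named in the commit message):

```yaml
# ConfigMap data fragment for the controller.
data:
  client-body-buffer-size: "1m"
  worker-shutdown-timeout: "3600s"
  proxy-read-timeout: "3600"      # seconds; keeps long-lived websocket reads open
  ssl-session-timeout: "10m"
```

The default certificate would then be passed to the controller as `--default-ssl-certificate=<namespace>/<secret-name>`.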