Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.
Description
Making adjustments to the node pool (e.g. changing preemptible to non-preemptible) results in the node pool being destroyed and recreated.
The way Terraform carries this out, workloads are always interrupted.
It would be possible to perform this adjustment without interrupting workloads by:
1. Creating a new node pool with a different name
2. Waiting for workloads to shift to the new pool
3. Destroying the old node pool
The above process can already be approximated if the end user derives the node pool's name from a preemptible variable.
The only remaining reason this won't work is that terraform-provider-google doesn't prioritize node pool creations over deletions.
In addition to the outage above, I've seen Terraform destroy the node pool and not recreate it when the preemptible setting is changed, leaving the GKE cluster with zero node pools.
New or Affected Resource(s)
google_container_node_pool
Potential Terraform Configuration
Any basic GKE cluster with "remove_default_node_pool" set to true.
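A minimal sketch of such a configuration; the resource names, location, machine type, and variable name below are placeholders, not values from the original report:

```hcl
variable "preemptible" {
  type    = bool
  default = true
}

resource "google_container_cluster" "primary" {
  name     = "example-cluster"
  location = "us-central1"

  # Remove the default node pool so the cluster only runs the
  # separately managed pool below.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  name       = "primary-pool"
  cluster    = google_container_cluster.primary.name
  location   = google_container_cluster.primary.location
  node_count = 1

  node_config {
    machine_type = "e2-medium"
    # Flipping this value forces Terraform to destroy and recreate the
    # node pool, interrupting any workloads running on it.
    preemptible = var.preemptible
  }
}
```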
Google made the google_container_node_pool resource immutable, so any changes need to be applied to a new node pool. You can't update an existing node pool.
You can leverage resource lifecycles and name_prefix to ensure that a new node pool is created before the old one is destroyed. Changing node pools is always disruptive, so design your Kubernetes workloads accordingly.
I'd recommend doing this across two applies: create the new pool in one apply, cordon the nodes in Kubernetes, and then delete the now-unused pool in a second apply. There isn't a great way to expose a better UX in Terraform alone, unfortunately!
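A minimal sketch of that name_prefix / create_before_destroy approach, reusing the hypothetical variable and cluster from the configuration above; the prefix and node settings are illustrative only:

```hcl
resource "google_container_node_pool" "primary" {
  # name_prefix lets Terraform generate a unique name, so the
  # replacement pool can coexist with the old one during
  # create_before_destroy.
  name_prefix = "primary-"
  cluster     = google_container_cluster.primary.name
  location    = google_container_cluster.primary.location
  node_count  = 1

  node_config {
    machine_type = "e2-medium"
    preemptible  = var.preemptible
  }

  lifecycle {
    # Create the replacement pool before destroying the old one so the
    # cluster is never left without a node pool.
    create_before_destroy = true
  }
}
```

Even with create_before_destroy, pods on the old pool are evicted when it is deleted, so cordoning and draining those nodes before the second apply is still worthwhile.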
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
References
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool