Load balancer initialisation fails #1642
-
Description

I had some issues with the load balancer after an upgrade, switched to the metal/klipper LB, and I'm now trying to switch back. I dug around a bit and found some things, but am stuck now. tofu creates the expected load balancer with the set name (k3s-nginx), but it stays unconfigured. There are some issues at the cloud controller manager describing similar problems, and I'm not sure who's to blame: hetznercloud/hcloud-cloud-controller-manager#811 & hetznercloud/hcloud-cloud-controller-manager#812. Parts of the output of kubectl logs -f -n kube-system deployments/hcloud-cloud-controller-manager:
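(A narrower way to pull only the load-balancer lines out of that log, as a rough sketch — the grep pattern and --tail value are arbitrary, and kubectl is assumed to already point at the affected cluster:)

  # Filter the cloud-controller-manager logs for load-balancer related lines
  kubectl -n kube-system logs deployments/hcloud-cloud-controller-manager --tail=500 \
    | grep -i 'load.balancer'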
Kube.tf file

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token
  source       = "kube-hetzner/kube-hetzner/hcloud"
  version      = "2.15.4"

  ssh_public_key       = data.hcloud_ssh_key.admin_key.public_key
  ssh_private_key      = null
  ssh_hcloud_key_label = "role=admin"
  ssh_max_auth_tries   = 10
  hcloud_ssh_key_id    = data.hcloud_ssh_key.admin_key.id

  control_plane_nodepools = [
    {
      name         = "control-plane-fsn1",
      server_type  = "cx22",
      location     = "fsn1",
      labels       = [],
      taints       = [],
      count        = 1
      zram_size    = "2G"
      kubelet_args = ["kube-reserved=cpu=250m,memory=1500Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]
    },
    {
      name         = "control-plane-nbg1",
      server_type  = "cx22",
      location     = "nbg1",
      labels       = [],
      taints       = [],
      count        = 1
      zram_size    = "2G"
      kubelet_args = ["kube-reserved=cpu=250m,memory=1500Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]
    },
    {
      name         = "control-plane-hel1",
      server_type  = "cx22",
      location     = "hel1",
      labels       = [],
      taints       = [],
      count        = 1
      zram_size    = "2G"
      kubelet_args = ["kube-reserved=cpu=250m,memory=1500Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]
    }
  ]

  agent_nodepools = [
    {
      name        = "agent-small",
      server_type = "cx22",
      location    = "fsn1",
      labels = [
        "node.longhorn.io/create-default-disk=config",
      ],
      taints    = [],
      zram_size = "2G"
      nodes = {
        "0" : {
        },
        "1" : {
          server_type : "cx32",
          location = "nbg1",
        },
      }
    },
    {
      name        = "agent-arm-small",
      server_type = "cax21",
      location    = "fsn1",
      labels = [
        "node.longhorn.io/create-default-disk=config",
      ],
      zram_size = "2G"
      taints    = [],
      count     = 2,
    },
  ]

  enable_wireguard       = true
  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"
  base_domain            = "${var.subdomain}.${var.domain}"

  enable_csi_driver_smb  = true
  enable_longhorn        = true
  longhorn_namespace     = "longhorn-system"
  longhorn_fstype        = "ext4"
  longhorn_replica_count = 3

  ingress_controller       = "nginx"
  system_upgrade_use_drain = true
  initial_k3s_channel      = "stable"

  /* k3s_registries = <<-EOT
    mirrors:
      hub.my_registry.com:
        endpoint:
          - "hub.my_registry.com"
    configs:
      hub.my_registry.com:
        auth:
          username: username
          password: password
  EOT */

  additional_k3s_environment = {
    "CONTAINERD_HTTP_PROXY" : "http://localhost:1055",
    "CONTAINERD_HTTPS_PROXY" : "http://localhost:1055",
    "NO_PROXY" : "127.0.0.0/8,10.128.0.0/9,10.0.0.0/10,",
  }

  preinstall_exec = [
    "curl -vL https://registry.gitlab.com",
  ]

  k3s_exec_agent_args = "--kubelet-arg image-gc-high-threshold=50 --kubelet-arg=image-gc-low-threshold=45"

  extra_firewall_rules = []

  enable_cert_manager = true
  dns_servers         = []
  lb_hostname         = "${var.subdomain}.${var.domain}"

  extra_kustomize_parameters = {
    vpn_domain = var.vpn_domain,
  }

  create_kubeconfig    = false
  create_kustomization = false

  longhorn_values = <<EOT
defaultSettings:
  createDefaultDiskLabeledNodes: true
  defaultDataPath: /var/longhorn
  node-down-pod-deletion-policy: delete-both-statefulset-and-deployment
persistence:
  defaultFsType: ext4
  defaultClassReplicaCount: 3
  defaultClass: true
EOT
}

Screenshots

No response

Platform

Linux
Replies: 1 comment
-
This behavior stems from how the Hetzner Cloud Controller Manager (hcloud-ccm) handles load balancer names. The CCM typically attempts to create or rename a load-balancer resource to match the Kubernetes Service object. In your setup, there is already a load balancer named k3s-nginx (created by Terraform), and the CCM is also trying to manage (or rename) another load balancer to the same name, which leads to the uniqueness_error.

In effect, you ended up with two load balancers:

- k3s-nginx: unconfigured, owned by Terraform (or tofu), but recognized by the CCM as already existing.
- nginx-ingress-nginx-controller: the one the CCM created for the ingress controller's Service.

Because of this naming collision, the CCM's rename logic tries to rename the randomly created LB to k3s-nginx, which cannot succeed while that name is already taken. This scenario is also mentioned in hetznercloud/hcloud-cloud-controller-manager#811 and hetznercloud/hcloud-cloud-controller-manager#812, which discuss how the CCM tries to unify the load-balancer name with the name configured on the Service.

Outcome: No fix is needed in this module itself. It's a known quirk/bug in the Hetzner CCM name-handling logic when there is an existing LB resource with the same name that the CCM tries to manage. The recommended approach is to let either Terraform or the CCM own the LB fully; mixing both can cause these collisions.
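As a minimal sketch of what "letting one side own the LB fully" can look like in practice — assuming the hcloud CLI and kubectl are configured for this project and cluster; the namespace (nginx), the placeholder LB name, and the use of the load-balancer.hetzner.cloud/name annotation are assumptions, not something stated above, so verify the real names with the list command first:

  # Show the load balancers that currently exist in the Hetzner project
  hcloud load-balancer list

  # Option A: keep the Terraform-managed "k3s-nginx" LB. Delete the LB the CCM
  # created for the ingress Service (placeholder name below; use the name printed
  # by the list command), then point the CCM at the existing LB via its name
  # annotation so it stops trying to create or rename another one.
  hcloud load-balancer delete <ccm-created-lb-name>
  kubectl -n nginx annotate service nginx-ingress-nginx-controller \
    load-balancer.hetzner.cloud/name=k3s-nginx --overwrite

  # Option B: let the CCM own the LB instead. Remove the k3s-nginx LB from the
  # Terraform/tofu side so the CCM can claim that name without a uniqueness error.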