k3s 0.5 agent node fail to create nginx pod when starting as systemd service #478
Comments
I have the same issue. Funnily enough, the k3s server, which also includes the agent, works just fine. Are you sure that it is caused by systemd? |
I am not sure if it is directly related to systemd. Maybe I can raise the kubelet log level; hopefully that will give more clues. |
I attached a debug log from the k3s agent node with the kubelet log level set to 4. It appears the kubelet fails to create the sandbox for the nginx pod, and there are some errors from factory.go about Factory "systemd"; I am not sure if that is related.

root@Office-R220-vli:/home/vincent# kubectl get po -o wide
NAME          READY   STATUS              RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES
nginx-kzbq6   0/1     ContainerCreating   0          24m                home-ubuntu
nginx-spws4   1/1     Running             0          24m   10.42.0.23   office-r220-vli

Search for "nginx-kzbq6" in the attached debug log.

The systemd service used to start the k3s agent node:

[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network-online.target

[Service]
Type=notify
ExecStart=/usr/local/bin/k3s agent --kubelet-arg "v=4" -s https://192.168.1.30:6443 -t "K102fcec9c1bd7ecdeb16a471571d2fb3abe6d0a3d49cb69b029d0264ea78a71e3c::node:099ebc33d9626443a5290c3cc146602a"
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity

[Install]
WantedBy=multi-user.target |
@ibuildthecloud is this issue fixed in 0.6.0-rc2? |
Version 0.6.0-rc3 suffers from the same problem:
|
I am not sure I can reproduce the issue. Steps:
Results: I can see all pods in a running state normally. @vincentmli, can you describe the exact steps for reproducing the problem? |
Are you also running Ubuntu 18.04? Maybe it is caused by Ubuntu. |
I tried the following k3s-agent service on CentOS 7 with v0.5.0, and it appears to be working @galal-hussein

[root@k3s-agent ~]# cat /etc/systemd/system/k3s-agent.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network-online.target

[Service]
Type=exec
Environment="HTTP_PROXY=http://10.3.254.254:3128/"
Environment="HTTPS_PROXY=http://10.3.254.254:3128/"
EnvironmentFile=/etc/systemd/system/k3s-agent.service.env
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s agent --server https://10.3.72.189:6443 --token-file /usr/local/bin/node-token --flannel-iface ens224 --node-ip 10.169.72.98
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always

[Install]
WantedBy=multi-user.target

[root@rancher-k3s home]# kubectl get po -o wide --all-namespaces
NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE     IP             NODE          NOMINATED NODE   READINESS GATES
default       nginx-7rlzc                   1/1     Running   0          3m51s   10.42.0.205    rancher-k3s
default       nginx-pjh5g                   1/1     Running   0          3m51s   10.42.4.4      k3s-agent
kube-system   cc-cluster-574dc9565c-kmkhk   1/1     Running   0          24s     10.169.72.98   k3s-agent
kube-system   coredns-695688789-sjt4j       1/1     Running   0          15d     10.42.0.196    rancher-k3s |
A k3s 0.6.1 worker still does not work with Ubuntu 18.04:
@galal-hussein is there anything (further logs or something like that) you need from us in order to fix this kind of issue with Ubuntu? |
I fixed it with network device cleanup:
|
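The exact commands were not included in the comment above. A minimal sketch of what such a network device cleanup typically looks like on a k3s node, run on the affected agent before restarting it (the interface names, CNI state path, and service name are common defaults, not taken from the comment):

# Remove leftover flannel/CNI interfaces and CNI state, then restart the agent.
ip link delete flannel.1
ip link delete cni0
rm -rf /var/lib/cni/
systemctl restart k3s-agent    # unit name is an assumption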
I have the same problem with my setup. The master+agent node is able to run pods; the node started only in agent mode cannot. Here are some logs from
I've updated CentOS 7 to the latest release:
I'm running the same version on both master and agent. I've installed the latest version of k3s. |
I'm seeing this same issue with k3s v0.6.1 on Xenial and Bionic. When I switch from using the example systemd unit file in the git repo to running the command via sudo in an interactive shell, the pods immediately spring to life and I get no "context deadline exceeded" errors. |
After looking over the install script, I realized that there's a small difference between the systemd unit files created by the installer script and the example unit file in the root of the GitHub repo. For the server, the installer script uses Type=notify, but for the agent it uses Type=exec, whereas the example unit file uses Type=notify. From reading the systemd manual, the only difference between these options is how/when systemd considers the service as started in order to trigger follow-up units. I'm guessing that systemd also performs some other actions that are important for the proper functioning of the agent processes, but it never does them because it's waiting to be notified by the agent process. |
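For reference, a side-by-side sketch of that difference, based on the two unit files quoted earlier in this thread (everything other than the Type= line is abbreviated):

# Failing setup (the example unit file style, as quoted in the first comment):
[Service]
Type=notify
ExecStart=/usr/local/bin/k3s agent ...

# Working setup (the CentOS 7 unit above):
[Service]
Type=exec
ExecStart=/usr/local/bin/k3s agent ...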
That makes sense, thanks for pointing this out @agaffney. We should modify the install script to change the type for agents, or maybe better add some code to send a systemd notification from the agent also. |
The install script already does the right thing. The problem is that not everybody uses the installer (I'm not a fan of curlpipes). It would probably be a good idea to update the example systemd unit file in the repo and docs to reflect this difference, but modifying the agent so that there is no difference is also a good idea. |
Describe the bug
Running the k3s agent node as a systemd service as below causes the agent node to fail to create pods.
Start the k3s agent service as:
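The unit file itself is not reproduced at this point in the report. A minimal sketch of how the service would be installed and started, assuming the unit is saved as /etc/systemd/system/k3s-agent.service (the path and unit name are assumptions):

systemctl daemon-reload
systemctl enable --now k3s-agent.service
systemctl status k3s-agent.service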
Deploy the nginx pod and service: kubectl apply -f nginx_cluster_pod_service.yaml
The nginx pod on the agent node is stuck in ContainerCreating.
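A sketch of the usual checks for a pod stuck in ContainerCreating, assuming the pod name nginx-kzbq6 from the kubectl output quoted earlier:

kubectl describe pod nginx-kzbq6    # the Events section shows sandbox creation / CNI errors
kubectl get events --sort-by=.metadata.creationTimestamp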
Running the k3s agent node from the command line as below, the nginx pod is able to be created.
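The exact command was elided here. Based on the ExecStart line of the unit file quoted in the comments above, the interactive invocation would look roughly like this (the token is abbreviated to a placeholder):

/usr/local/bin/k3s agent --kubelet-arg "v=4" -s https://192.168.1.30:6443 -t "<node-token>"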
To Reproduce
Steps to reproduce the behavior:
Download k3s 0.5 and run the k3s agent node from a systemd service as in the example above.
Expected behavior
The k3s agent node should be able to run from a systemd service and create pods.
Screenshots
k3s agent node startup logs from systemd service
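The logs themselves are not reproduced here. A sketch of how they would typically be collected on the agent node, assuming the service unit is named k3s-agent (the unit name is an assumption):

journalctl -u k3s-agent --no-pager -e    # recent startup logs of the agent service
journalctl -u k3s-agent -f               # follow the logs live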