I created a k8s cluster on RHEL7 with the kubernetes packages at GitVersion: "v1.8.1". I am trying to deploy WordPress on this custom cluster, but pod creation is always stuck in ContainerCreating status.
[phani@k8s-master]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS              RESTARTS   AGE
default       wordpress-766d75457d-zlvdn           0/1     ContainerCreating   0          11m
kube-system   etcd-k8s-master                      1/1     Running             0          1h
kube-system   kube-apiserver-k8s-master            1/1     Running             0          1h
kube-system   kube-controller-manager-k8s-master   1/1     Running             0          1h
kube-system   kube-dns-545bc4bfd4-bb8js            3/3     Running             0          1h
kube-system   kube-proxy-bf4zr                     1/1     Running             0          1h
kube-system   kube-proxy-d7zvg                     1/1     Running             0          34m
kube-system   kube-scheduler-k8s-master            1/1     Running             0          1h
kube-system   weave-net-92zf9                      2/2     Running             0          34m
kube-system   weave-net-sh7qk                      2/2     Running             0          1h
Docker version: 1.13.1
Pod status from the describe command:
Normal   Scheduled               18m                default-scheduler   Successfully assigned wordpress-766d75457d-zlvdn to worker1
Normal   SuccessfulMountVolume   18m                kubelet, worker1    MountVolume.SetUp succeeded for volume "default-token-tmpcm"
Warning  DNSSearchForming        18m                kubelet, worker1    Search Line limits were exceeded, some dns names have been omitted, the applied search line is: default.svc.cluster.local svc.cluster.local cluster.local
Warning  FailedCreatePodSandBox  14m                kubelet, worker1    Failed create pod sandbox.
Warning  FailedSync              25s (x8 over 14m)  kubelet, worker1    Error syncing pod
Normal   SandboxChanged          24s (x8 over 14m)  kubelet, worker1    Pod sandbox changed, it will be killed and re-created.
From the kubelet logs on the worker I found the error below:
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
However, kubelet itself is running stably and there are no other problems on the worker.
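For reference, the driver mismatch can be confirmed like this (a sketch assuming the stock kubeadm drop-in on RHEL7; the file path and variable name are the kubeadm defaults, not something confirmed from this cluster):

# Which cgroup driver Docker is using ("systemd" in the error above)
docker info 2>/dev/null | grep -i "cgroup driver"
# Which driver the kubelet was started with (kubeadm sets it via KUBELET_CGROUP_ARGS)
grep cgroup-driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# If they differ, edit the drop-in so --cgroup-driver matches Docker, then:
sudo systemctl daemon-reload
sudo systemctl restart kubelet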
How can I fix this problem?
I checked for a CNI failure, but I could not find anything:
~]# ls /opt/cni/bin
bridge cnitool dhcp flannel host-local ipvlan loopback macvlan noop ptp tuning weave-ipam weave-net weave-plugin-2.3.0
In the kubelet logs below, the same messages keep repeating; the kubelet seems to keep trying to kill and re-create the pod sandbox.
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421184 14339 remote_runtime.go:115] StopPodSandbox "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304" from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421212 14339 kuberuntime_manager.go:780] Failed to stop sandbox {"docker" "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304"}
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421247 14339 kuberuntime_manager.go:580] killPodWithSyncResult failed: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421262 14339 pod_workers.go:182] Error syncing pod 7f1c6bf1-6af3-11e8-856b-fa163e3d1891 ("wordpress-766d75457d-spdrb_default(7f1c6bf1-6af3-11e8-856b-fa163e3d1891)"), skipping: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
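The DeadlineExceeded errors mean the runtime call to Docker is timing out. For completeness, the stuck sandbox can be inspected directly through Docker (a sketch, assuming the Docker runtime shown in these logs; the container ID is taken from the StopPodSandbox line above):

# Look at the sandbox container the kubelet cannot stop
docker ps -a | grep 47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304
# If Docker itself is hanging on it, the daemon log usually shows why
journalctl -u docker.service -n 50 --no-pager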
Answer 0 (score: 2)
Failed create pod sandbox.
... is almost always a CNI failure; I would check on the node that all the Weave containers are happy and that /opt/cni/bin exists (or its Weave equivalent).
You may have to check journalctl -u kubelet.service as well as the docker logs for any running containers to discover the full scope of the errors on the node.
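A sketch of those checks in command form (the label selector is the one the standard Weave Net manifest uses; adjust it if your manifest differs):

# Are the Weave pods healthy on every node?
kubectl -n kube-system get pods -l name=weave-net -o wide
# Do the CNI binaries exist on the affected node?
ls /opt/cni/bin
# Dig through kubelet and container logs for the full picture
journalctl -u kubelet.service --no-pager | tail -n 100
docker ps -a --filter name=weave   # then: docker logs <container-id>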
Answer 1 (score: 1)
It seems that removing $KUBELET_NETWORK_ARGS from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf can help. I removed $KUBELET_NETWORK_ARGS, restarted the worker node, and the pods were then deployed successfully.
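A minimal sketch of that edit, assuming the stock kubeadm drop-in (back the file up first; note this disables the CNI network arguments, so treat it as a workaround rather than a fix):

# Back up the drop-in before editing
sudo cp /etc/systemd/system/kubelet.service.d/10-kubeadm.conf{,.bak}
# Delete the $KUBELET_NETWORK_ARGS token from the ExecStart line (verify with grep afterwards)
sudo sed -i 's/\$KUBELET_NETWORK_ARGS //' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Pick up the change and restart the node's kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet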
Answer 2 (score: 0)
As Matthew said, this is most likely a CNI failure.
First, find the node this pod is running on:
kubectl get po wordpress-766d75457d-zlvdn -o wide
Next, on the node where the pod is located, check /etc/cni/net.d; if you have more than one .conf file there, you can delete one and restart the node.
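For illustration, the steps might look like this (the flannel filename is hypothetical; keep whichever .conf belongs to the CNI plugin you actually run):

# On the node reported by the previous command:
ls /etc/cni/net.d
# e.g. 10-flannel.conf  10-weave.conf  <- two plugins fighting over the pod network
sudo mv /etc/cni/net.d/10-flannel.conf /root/
sudo reboot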
Source: https://github.com/kubernetes/kubeadm/issues/578.
Note that this is only one of the possible solutions.
Answer 3 (score: 0)
While hopefully this isn't anybody else's problem, for me this happened when part of my filesystem was full.
I had pods stuck in ContainerCreating on only one node in my cluster. I also had a bunch of pods that I expected to shut down, but they hadn't. Someone recommended running
sudo systemctl status kubelet -l
and it showed me a bunch of lines like
Jun 18 23:19:56 worker01 kubelet[1718]: E0618 23:19:56.461378 1718 kuberuntime_manager.go:647] createPodSandbox for pod "REDACTED(2c681b9c-...)" failed: ... /var/log/pods/2c681b9c-cf5b-11eb-9c79-52540077cc53: no space left on device
I confirmed that I was out of space:
$ df -h
Filesystem                     Size  Used Avail Use% Mounted on
devtmpfs                       189G     0  189G   0% /dev
tmpfs                          189G     0  189G   0% /sys/fs/cgroup
/dev/mapper/vg01-root           20G  7.0G   14G  35% /
/dev/mapper/vg01-tmp           4.0G   34M  4.0G   1% /tmp
/dev/mapper/vg01-home          4.0G   72M  4.0G   2% /home
/dev/mapper/vg01-varlog         10G   10G   20K 100% /var/log
/dev/mapper/vg01-varlogaudit   2.0G   68M  2.0G   4% /var/log/audit
I just had to clear out that directory (and do some manual cleanup on all the pending pods and the pods that were stuck running).
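A sketch of that cleanup (the du target and the placeholders are illustrative; be careful not to delete logs that are still being written):

# Find what is eating the full filesystem
sudo du -sh /var/log/* | sort -h | tail
# Remove stale pod log directories once you are sure those pods are dead
sudo rm -rf /var/log/pods/<old-pod-uid>
# Force-delete pods that are stuck terminating
kubectl delete pod <pod-name> --grace-period=0 --force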