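Pod creation stuck in ContainerCreating state

Time: 2018-06-07 14:55:41

Tags: docker kubernetes

I created a k8s cluster on RHEL7 with the Kubernetes packages at GitVersion "v1.8.1". I am trying to deploy WordPress on my custom cluster, but pod creation always stays in the ContainerCreating state.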

[phani@k8s-master]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                                        READY     STATUS              RESTARTS   AGE
default       wordpress-766d75457d-zlvdn                                  0/1       ContainerCreating   0          11m
kube-system   etcd-k8s-master                                             1/1       Running             0          1h
kube-system   kube-apiserver-k8s-master                                   1/1       Running             0          1h
kube-system   kube-controller-manager-k8s-master                          1/1       Running             0          1h
kube-system   kube-dns-545bc4bfd4-bb8js                                   3/3       Running             0          1h
kube-system   kube-proxy-bf4zr                                            1/1       Running             0          1h
kube-system   kube-proxy-d7zvg                                            1/1       Running             0          34m
kube-system   kube-scheduler-k8s-master                                   1/1       Running             0          1h
kube-system   weave-net-92zf9                                             2/2       Running             0          34m
kube-system   weave-net-sh7qk                                             2/2       Running             0          1h

Docker version: 1.13.1

Pod status from the describe command:
      Normal   Scheduled               18m                default-scheduler                           Successfully assigned wordpress-766d75457d-zlvdn to worker1
      Normal   SuccessfulMountVolume   18m                kubelet, worker1                            MountVolume.SetUp succeeded for volume "default-token-tmpcm"
      Warning  DNSSearchForming        18m                kubelet, worker1                            Search Line limits were exceeded, some dns names have been omitted, the applied search line is: default.svc.cluster.local svc.cluster.local cluster.local 
      Warning  FailedCreatePodSandBox  14m                kubelet, worker1                            Failed create pod sandbox.
      Warning  FailedSync              25s (x8 over 14m)  kubelet, worker1                            Error syncing pod
      Normal   SandboxChanged          24s (x8 over 14m)  kubelet, worker1                            Pod sandbox changed, it will be killed and re-created.

In the kubelet logs on the worker I found the error below:

error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

However, the kubelet is stable and no problems show up on the worker.

How can I fix this?

I checked for CNI failures and could not find anything.

~]# ls /opt/cni/bin
bridge  cnitool  dhcp  flannel  host-local  ipvlan  loopback  macvlan  noop  ptp  tuning  weave-ipam  weave-net  weave-plugin-2.3.0

In the journal log below, the same messages keep repeating. It looks like the scheduler keeps trying to create the container.

Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421184   14339 remote_runtime.go:115] StopPodSandbox "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304" from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421212   14339 kuberuntime_manager.go:780] Failed to stop sandbox {"docker" "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304"}
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421247   14339 kuberuntime_manager.go:580] killPodWithSyncResult failed: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421262   14339 pod_workers.go:182] Error syncing pod 7f1c6bf1-6af3-11e8-856b-fa163e3d1891 ("wordpress-766d75457d-spdrb_default(7f1c6bf1-6af3-11e8-856b-fa163e3d1891)"), skipping: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"

4 answers:

Answer 0: (score: 2)

  

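"Failed create pod sandbox."

...is almost always a CNI failure. On the node, I would check that all the weave containers are healthy and that /opt/cni/bin (or its weave equivalent) is present.

You may have to check both journalctl -u kubelet.service and the docker logs of any running containers to discover the full scope of the error on the node.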
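The checks above could be run roughly as follows. This is only a sketch: the helper name sandbox_errors and its grep pattern are assumptions made for illustration, and the commented commands assume the Weave pods have "weave" in their names, as in the listing in the question.

```shell
#!/bin/sh
# Keep only kubelet log lines that mention the pod sandbox or CNI.
# Reads log text on stdin, so it works on live journal output or a saved file.
# (sandbox_errors is a hypothetical helper name, not part of any tool.)
sandbox_errors() {
    grep -iE 'podsandbox|cni'
}

# Typical use on the affected worker node:
#   kubectl get pods -n kube-system -o wide | grep weave   # are the weave pods Running?
#   ls /opt/cni/bin                                        # are the CNI plugin binaries present?
#   journalctl -u kubelet.service --no-pager | sandbox_errors
#   docker ps -a                                           # then: docker logs <container-id>
```

Filtering the journal first makes it easier to see whether the repeated messages all point at the same sandbox ID.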

Answer 1: (score: 1)

It seems to work if you remove $KUBELET_NETWORK_ARGS from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

I removed $KUBELET_NETWORK_ARGS and restarted the worker node, after which the pod deployed successfully.
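A minimal sketch of that edit, assuming the variable is referenced on the ExecStart line of the drop-in, as is typical for kubeadm installs of this era. The helper name strip_network_args is hypothetical, and the reload and restart commands are a lighter alternative to rebooting the node, not what the answer itself did. As far as I know, this variable normally carries the --network-plugin=cni flags, so removing it takes the kubelet off CNI; treat it as a workaround, not a fix for the underlying CNI problem.

```shell
#!/bin/sh
# Remove the $KUBELET_NETWORK_ARGS reference from the ExecStart line of a
# kubelet systemd drop-in, keeping a .bak copy of the original file.
# (strip_network_args is a hypothetical helper name.)
strip_network_args() {
    conf="${1:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}"
    cp "$conf" "$conf.bak" || return 1
    # Only touch ExecStart lines; the Environment= definition is left in place.
    sed '/^ExecStart=/ s/ \$KUBELET_NETWORK_ARGS//' "$conf.bak" > "$conf"
}

# On the worker node, as root:
#   strip_network_args
#   systemctl daemon-reload
#   systemctl restart kubelet
```

The backup file makes it easy to restore the original drop-in if the pod network stops working afterwards.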

Answer 2: (score: 0)

As Matthew said, this is most likely a CNI failure.

First, find the node this pod is running on:

kubectl get po wordpress-766d75457d-zlvdn -o wide 

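Next, on the node where the pod is located, check /etc/cni/net.d. If it contains more than one .conf file, you can delete one and restart the node.

Source: https://github.com/kubernetes/kubeadm/issues/578

Note that this is only one of the possible solutions.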
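The /etc/cni/net.d check described above could be scripted along these lines. It is a sketch under assumptions: count_cni_confs is a hypothetical helper name, and it only counts files ending in .conf, which is all this answer mentions.

```shell
#!/bin/sh
# Count the *.conf files directly inside a CNI config directory.
# (count_cni_confs is a hypothetical helper name.)
count_cni_confs() {
    dir="${1:-/etc/cni/net.d}"
    n=0
    for f in "$dir"/*.conf; do
        # The glob stays unexpanded when nothing matches, so test each entry.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# On the node hosting the stuck pod:
#   ls -l /etc/cni/net.d
#   [ "$(count_cni_confs)" -gt 1 ] && echo "more than one CNI .conf file found"
```

Which file to delete depends on which network plugin the cluster is actually meant to use, so inspect the files before removing anything.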

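Answer 3: (score: 0)

Hopefully this is not anyone else's problem, but for me this happened when part of my filesystem was full.

I had pods stuck in ContainerCreating on only one node in my cluster. I also had a bunch of pods that I expected to shut down, but they did not. Someone recommended running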

sudo systemctl status kubelet -l

which showed me a bunch of lines like

Jun 18 23:19:56 worker01 kubelet[1718]: E0618 23:19:56.461378 1718 kuberuntime_manager.go:647] createPodSandbox for pod "REDACTED(2c681b9c-1907cf59failure) var/log/pods/2c681b9c-cf5b-11eb-9c79-52540077cc53: no space left on device

I confirmed that I was out of space:

$ df -h
Filesystem                    Size  Used Avail Use% Mounted on
devtmpfs                      189G     0  189G   0% /dev
tmpfs                         189G     0  189G   0% /sys/fs/cgroup
/dev/mapper/vg01-root          20G  7.0G   14G  35% /
/dev/mapper/vg01-tmp          4.0G   34M  4.0G   1% /tmp
/dev/mapper/vg01-home         4.0G   72M  4.0G   2% /home
/dev/mapper/vg01-varlog        10G   10G   20K 100% /var/log
/dev/mapper/vg01-varlogaudit  2.0G   68M  2.0G   4% /var/log/audit

I just had to clean out that directory (and do some manual cleanup of all the pending pods and the pods that were stuck running).
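To spot a full filesystem like this quickly, the df output can be filtered by usage. This is a rough sketch: the helper name full_mounts and the default threshold of 90 percent are assumptions, and it expects the six-column layout shown above with mount points that contain no spaces.

```shell
#!/bin/sh
# Print mount points whose usage is at or above a threshold (in percent).
# Reads `df -P` style output on stdin; the first line is treated as a header.
# (full_mounts is a hypothetical helper name.)
full_mounts() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # "100%" -> "100"
        if (use + 0 >= t) print $6, $5
    }'
}

# On the affected node:
#   df -P | full_mounts 90
```

Fed the df listing above, the filter leaves only the /var/log line, which is the filesystem that the kubelet error complained about.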