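I am facing a strange issue with my pods. I launch around 20 pods in my env, and every time a random 3-4 of them hang in Init:0/1 status. When I check the status of such a pod, the init container shows as Running, although it should terminate once its task has finished, and the app container shows Waiting / PodInitializing. The same init container image and spec are used across all 20 pods, but the problem hits a different random set of pods each time. When I delete one of these stuck pods, it hangs in the Terminating state. If I ssh to the node where the pod was scheduled and run docker ps, the init container is listed as running, but docker exec fails with an error saying the container does not exist. The init container pulls configs from the Consul Server. On checking the volume (path obtained from docker inspect), I found that it had pulled all the key-val pairs correctly and saved them under the defined file name. I have checked resources on all the nodes, and every node has more than enough available.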
Below is a detailed example of one such pod.
Kubectl version:
kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Pods:
kubectl get pods -n dev1|grep -i session-service
session-service-app-75c9c8b5d9-dsmhp 0/1 Init:0/1 0 10h
session-service-app-75c9c8b5d9-vq98k 0/1 Terminating 0 11h
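With around 20 pods in the namespace, the stuck ones can be picked out of the listing with a small filter. This is a minimal sketch, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS, ...); the helper name `stuck_pods` is hypothetical and not part of kubectl.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the name and status of pods whose STATUS column starts with "Init:"
# or is exactly "Terminating".
stuck_pods() {
  awk '$3 ~ /^Init:/ || $3 == "Terminating" { print $1, $3 }'
}

# Usage (namespace taken from the question):
# kubectl get pods -n dev1 --no-headers | stuck_pods
```

Filtering on the STATUS column rather than grepping the whole line avoids false matches on pod names that happen to contain "Init".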
Pod status:
kubectl describe pods session-service-app-75c9c8b5d9-dsmhp -n dev1
Name:           session-service-app-75c9c8b5d9-dsmhp
Namespace:      dev1
Node:           ip-192-168-44-18.ap-southeast-1.compute.internal/192.168.44.18
Start Time:     Fri, 27 Apr 2018 18:14:43 +0530
Labels:         app=session-service-app
                pod-template-hash=3175746185
                release=session-service-app
Status:         Pending
IP:             100.96.4.240
Controlled By:  ReplicaSet/session-service-app-75c9c8b5d9
Init Containers:
  initpullconsulconfig:
    Container ID:  docker://c658d59995636e39c9d03b06e4973b6e32f818783a21ad292a2cf20d0e43bb02
    Image:         shr-u-nexus-01.myops.de:8082/utils/app-init:1.0
    Image ID:      docker-pullable://shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd
    Port:          <none>
    Args:
      -consul-addr=consul-server.consul.svc.cluster.local:8500
    State:          Running
      Started:      Fri, 27 Apr 2018 18:14:44 +0530
    Ready:          False
    Restart Count:  0
    Environment:
      CONSUL_TEMPLATE_VERSION:  0.19.4
      POD:                      sand
      SERVICE:                  session-service-app
      ENV:                      dev1
    Mounts:
      /var/lib/app from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Containers:
  session-service-app:
    Container ID:
    Image:          shr-u-nexus-01.myops.de:8082/sand-images/sessionservice-init:sitv12
    Image ID:
    Port:           8080/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/appenv from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  shared-volume-sidecar:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-bthkv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bthkv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
Container status on the node:
sudo docker ps|grep -i session
c658d5999563 shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd "/usr/bin/consul-t..." 10 hours ago Up 10 hours k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
da120abd3dbb gcr.io/google_containers/pause-amd64:3.0 "/pause" 10 hours ago Up 10 hours k8s_POD_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
f53d48c7d6ec shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd "/usr/bin/consul-t..." 10 hours ago Up 10 hours k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
c26415458d39 gcr.io/google_containers/pause-amd64:3.0 "/pause" 10 hours ago Up 10 hours k8s_POD_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
Running docker exec (kubectl exec gives the same result):
sudo docker exec -it c658d5999563 bash
rpc error: code = 2 desc = containerd: container not found
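One way to narrow down the mismatch between docker ps and containerd is to ask Docker which PID it has recorded for the container and check whether that process still exists on the node. This is a rough sketch, assuming a Linux node with /proc mounted; `pid_alive` is a hypothetical helper, and the container ID is the one from the listing above. It only helps characterize the symptom and is not a fix.

```shell
# Hypothetical helper: succeeds if a process with the given PID exists
# on this Linux host. PID 0 is what Docker records for a container that
# has no running process, so it is treated as "not alive".
pid_alive() {
  [ -n "$1" ] && [ "$1" -gt 0 ] 2>/dev/null && [ -d "/proc/$1" ]
}

# Usage on the node:
# pid=$(sudo docker inspect --format '{{.State.Pid}}' c658d5999563)
# if pid_alive "$pid"; then echo "process $pid exists"; else echo "no such process"; fi
```

If Docker reports a PID that no longer exists, the init process has probably exited and Docker's view of the container is stale, which would be consistent with the "container not found" error.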