Kubernetes: runContainer: API error (500): Cannot start container (docker fails to umount)

Asked: 2016-08-10 13:27:05

Tags: docker kubernetes google-kubernetes-engine

Pod creation sometimes fails on our GKE cluster with a 500 error:

1m        1m        1         installer-u57ab1f7707b03   Pod                 Normal    Scheduled    {default-scheduler }                                       Successfully assigned installer-u57ab1f7707b03 to gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l
1m        1m        1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container ff8573fbf0b90a25b5565b1feb36671f13367115dde74e581cf249be772d8e4e: [8] System error: read parent: connection reset by peer\n"
1m        1m        1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container fbd7151d4489ed3ac9b21ef9ee3268039374fe3aee1f5933dc27d003f5388e7d: [8] System error: read parent: connection reset by peer\n"
1m        1m        1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container c6b7969fd036fd187f8b5b815106887d718780b290b81e6dde12162d15c22728: [8] System error: read parent: connection reset by peer\n"
49s       49s       1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container 5b0d78ee31759a3472f15fe375ef4f2542dcc65518023a1bd06593fe7d28a448: [8] System error: read parent: connection reset by peer\n"
32s       32s       1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container 7ff5941a30ce432aa1b1382e4b20d272a08a7113f79f7f1ff2f8898a00ca8f06: [8] System error: read parent: connection reset by peer\n"
18s       18s       1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container a91ae7d6dc9dee5196e73457d817bc46f8009c26147cc81727920aebfa52cc38: [8] System error: read parent: connection reset by peer\n"
2s        2s        1         installer-u57ab1f7707b03   Pod                 Warning   FailedSync   {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer\n"

In docker.log I found:

time="2016-08-10T12:37:24.458097892Z" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/shm: invalid argument\nfailed to umount /var/lib/docker/containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/mqueue: invalid argument"
time="2016-08-10T12:37:24.458280187Z" level=error msg="Handler for POST /containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/start returned error: Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer"
time="2016-08-10T12:37:24.458315257Z" level=error msg="HTTP Error" err="Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer" statusCode=500
time="2016-08-10T12:37:40.151776337Z" level=warning msg="signal: killed" 
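
For triage, it can help to know how many distinct containers hit this start error. The following is a minimal sketch, assuming the log lines keep the exact wording shown above; the names `START_ERROR` and `failed_container_ids` are hypothetical and not part of Docker or Kubernetes.

```python
import re

# Matches the failure signature seen in the events and in docker.log above.
# Assumption: the message wording is exactly as in this post.
START_ERROR = re.compile(
    r"Cannot start container (?P<cid>[0-9a-f]+): "
    r"\[8\] System error: read parent: connection reset by peer"
)


def failed_container_ids(lines):
    """Return the unique container IDs that hit the start error, in first-seen order."""
    seen = []
    for line in lines:
        for match in START_ERROR.finditer(line):
            cid = match.group("cid")
            if cid not in seen:
                seen.append(cid)
    return seen
```

Feeding it the lines of docker.log (for example `failed_container_ids(open(path))`, where `path` is wherever the log lives on the node) returns one entry per container, even though each failure is logged more than once.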

Kubernetes version: v1.2.5
Docker version: 1.9.1

Any ideas on how to solve this?

1 Answer:

Answer 0 (score: 2)

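This is probably caused by a runc bug in Docker 1.9: the container reads its configuration but closes the read pipe before the parent has finished writing.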

Docker 1.10 includes the fixed runc. Kubernetes 1.3 uses Docker 1.11.2, but until you upgrade you can work around the runc bug by changing the container's command line.
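
To check which nodes are still on an affected Docker release, a version comparison along these lines may be enough. This is only a sketch: the 1.10 cutoff is taken from the answer above, the helper names `parse_version` and `docker_has_fixed_runc` are made up for illustration, and it assumes plain numeric versions such as `1.9.1` with no build suffix.

```python
# Assumption from the answer above: the fixed runc first ships in Docker 1.10.
FIXED_DOCKER = (1, 10)


def parse_version(version):
    """Turn a version string such as '1.9.1' or 'v1.2.5' into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def docker_has_fixed_runc(docker_version):
    """Return True if the major.minor version is at or above the assumed fix release."""
    return parse_version(docker_version)[:2] >= FIXED_DOCKER
```

Comparing integer tuples rather than strings matters here, because as plain strings "1.9" sorts after "1.10". With the versions from this question, `docker_has_fixed_runc("1.9.1")` is false and `docker_has_fixed_runc("1.11.2")` is true.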