Pod creation sometimes fails on our GKE cluster with a 500 error:
1m 1m 1 installer-u57ab1f7707b03 Pod Normal Scheduled {default-scheduler } Successfully assigned installer-u57ab1f7707b03 to gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l
1m 1m 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container ff8573fbf0b90a25b5565b1feb36671f13367115dde74e581cf249be772d8e4e: [8] System error: read parent: connection reset by peer\n"
1m 1m 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container fbd7151d4489ed3ac9b21ef9ee3268039374fe3aee1f5933dc27d003f5388e7d: [8] System error: read parent: connection reset by peer\n"
1m 1m 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container c6b7969fd036fd187f8b5b815106887d718780b290b81e6dde12162d15c22728: [8] System error: read parent: connection reset by peer\n"
49s 49s 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container 5b0d78ee31759a3472f15fe375ef4f2542dcc65518023a1bd06593fe7d28a448: [8] System error: read parent: connection reset by peer\n"
32s 32s 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container 7ff5941a30ce432aa1b1382e4b20d272a08a7113f79f7f1ff2f8898a00ca8f06: [8] System error: read parent: connection reset by peer\n"
18s 18s 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container a91ae7d6dc9dee5196e73457d817bc46f8009c26147cc81727920aebfa52cc38: [8] System error: read parent: connection reset by peer\n"
2s 2s 1 installer-u57ab1f7707b03 Pod Warning FailedSync {kubelet gke-oro-cloud-v1-1445426963-ffbcc283-node-bo1l} Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer\n"
In docker.log I found:
time="2016-08-10T12:37:24.458097892Z" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/shm: invalid argument\nfailed to umount /var/lib/docker/containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/mqueue: invalid argument"
time="2016-08-10T12:37:24.458280187Z" level=error msg="Handler for POST /containers/ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508/start returned error: Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer"
time="2016-08-10T12:37:24.458315257Z" level=error msg="HTTP Error" err="Cannot start container ad8b7bbe72410232d7fe6197e057d15e9003e24f6d8aad15bc7068430cfea508: [8] System error: read parent: connection reset by peer" statusCode=500
time="2016-08-10T12:37:40.151776337Z" level=warning msg="signal: killed"
Kubernetes version: v1.2.5
Docker version: 1.9.1
Any ideas on how to fix this?