Docker hangs on build and other commands

Time: 2019-10-29 02:26:13

Tags: docker

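I have a very strange problem: the Docker daemon does not respond to build requests or to other commands such as docker info. However, if I build from an image that already exists locally (Dockerfile FROM some/exist/image), the build works fine, and docker version also works fine. We have about 56 VMs building images, and the build command is simply: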

export DOCKER_HOST='<my_host>'
docker build -t <tag> - < context.tar.gz

The client output is below; it appears to be stuck while extracting a layer:

[test]$ docker build -t test .
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM busybox
latest: Pulling from library/busybox
7c9d20b9b6cd: Extracting  32.77kB/760.8kB
^C

However, CPU and memory usage look fine, and there is still plenty of free disk space.

[~]# top
top - 10:18:09 up 48 days, 20:20,  1 user,  load average: 1.00, 1.03, 1.05
Tasks: 167 total,   1 running, 166 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  0.3 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  8006376 total,   603984 free,   612016 used,  6790376 buff/cache
KiB Swap:  4192928 total,  4186560 free,     6368 used.  6749588 avail Mem

Logs:

I started dockerd with systemctl. Strangely, I cannot see any recent log entries from dockerd; it looks as if dockerd has been stuck since 16:04:08:

cat /var/log/messages|grep docker
2019-10-28 16:04:07 dockerd[1549]: time="2019-10-28T16:04:07+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/d584a3957292ac1d3edfea2ec57abcf2dbe79465f9bcd41183b0ccd9dac3bb01/shim.sock" debug=false pid=18014
2019-10-28 16:04:08 dockerd[1549]: time="2019-10-28T16:04:08+08:00" level=info msg="shim reaped" id=d584a3957292ac1d3edfea2ec57abcf2dbe79465f9bcd41183b0ccd9dac3bb01
2019-10-28 16:04:08 dockerd[1549]: time="2019-10-28T16:04:08.081553011+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
2019-10-28 19:15:00 dockerd[1549]: time="2019-10-28T19:15:00.826993765+08:00" level=error msg="Not continuing with pull after error: context canceled"
2019-10-28 19:17:49 dockerd[1549]: time="2019-10-28T19:17:49.216471516+08:00" level=error msg="Handler for POST /v1.38/build returned error: Error processing tar file(exit status 1): unexpected EOF"
2019-10-28 19:42:18 dockerd[1549]: time="2019-10-28T19:42:18.954682094+08:00" level=error msg="Not continuing with pull after error: context canceled"
2019-10-28 19:51:51 dockerd[1549]: time="2019-10-28T19:51:51.611278576+08:00" level=info msg="Attempting next endpoint for pull after error: manifest unknown: manifest unknown"
2019-10-28 19:52:03 dockerd[1549]: time="2019-10-28T19:52:03.076089163+08:00" level=info msg="Attempting next endpoint for pull after error: manifest unknown: manifest unknown"
2019-10-28 19:55:41 dockerd[1549]: time="2019-10-28T19:55:41.605964950+08:00" level=info msg="Attempting next endpoint for pull after error: Get https://xx/centos/manifests/6: no basic auth credentials"
2019-10-28 19:55:44 dockerd[1549]: time="2019-10-28T19:55:44.298856115+08:00" level=info msg="Attempting next endpoint for pull after error: Get https://xx/centos/manifests/6: no basic auth credentials"

Docker processes:

[~]# ps -ef|grep docker
root      1549     1  6 Oct09 ?        1-05:45:10 /usr/bin/dockerd
root      1557  1549  0 Oct09 ?        00:42:20 docker-containerd --config /var/run/docker/containerd/containerd.toml
root      2574  1549  0 Oct09 ?        00:00:01 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8090 -container-ip x.x.x.x -container-port 8080
root      2581  1557  0 Oct09 ?        00:00:41 docker-containerd-shim -namespace moby -workdir /opt/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/56e9432a55304bc61d284d3a9da15272d292c1493d33d4499e481a0d64ff53e4 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
root     15630 15049  0 10:13 pts/0    00:00:00 grep --color=auto docker

Docker version:

[~]$ docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Process stack:

[root@vm ~]# cat /proc/1549/stack
[<ffffffff810f8564>] futex_wait_queue_me+0xc4/0x120
[<ffffffff810f90d9>] futex_wait+0x179/0x280
[<ffffffff810fb1de>] do_futex+0xfe/0x5b0
[<ffffffff810fb710>] SyS_futex+0x80/0x180
[<ffffffff8169d53d>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

I have seen GitHub issues such as https://github.com/docker/for-win/issues/813 and https://github.com/moby/moby/issues/12823, but I could not find a proper fix for this problem. How can I solve it? Any help would be appreciated.

System:

CentOS
Kernel: 3.10.0-514.41.1.el7 x86_64 GNU/Linux

2 answers:

Answer 0 (score: 2)

It is hard to reproduce your problem, but here are some ideas on how to fix it.

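  1. dockerd is in a bad state
    Sometimes restarting the daemon with systemctl restart docker solves the problem (on a stock install the systemd unit is normally named docker, not dockerd).
  2. The Docker installation is corrupted
    I have seen corrupted Docker installations. They cause strange behavior that can never be reproduced 1:1. Simply reinstall Docker on the machine.
  3. Other
    Many other factors can cause this problem. There may be an issue with your CentOS system. In one of your links, someone found that antivirus software was the culprit (see https://github.com/docker/for-win/issues/813#issuecomment-431031402).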
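
Before restarting or reinstalling anything, it can help to confirm which build hosts are actually hung rather than merely slow. A minimal sketch, assuming GNU coreutils timeout is available; the function name probe and the 10-second limit are illustrative and not part of the original answer:

```shell
# probe LIMIT CMD...: run CMD with a time limit and classify the result.
# Relies on GNU timeout exiting with status 124 when the limit is hit.
probe() {
  limit="$1"
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
  case $? in
    0)   echo ok ;;
    124) echo hung ;;
    *)   echo error ;;
  esac
}

# Example with the placeholder host from the question:
# export DOCKER_HOST='<my_host>'
# probe 10 docker info
```

Running this against each of the build VMs in turn separates daemons that never answer (hung) from ones that answer with a failure (error).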

Answer 1 (score: 0)

You should start the Docker daemon in debug mode, for example:

# run the daemon with debug logging enabled (-D is the short form)
dockerd --debug
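
One way to enable debug mode is through the daemon configuration file, using the "debug": true setting. A rough sketch, assuming the default config path /etc/docker/daemon.json and that no such file exists yet; the helper name write_debug_config is made up for illustration:

```shell
# write_debug_config PATH: create a daemon config that only enables debug logging.
# Refuses to overwrite an existing file, since it may hold other settings.
write_debug_config() {
  path="$1"
  if [ -e "$path" ]; then
    echo "refusing to overwrite $path; add \"debug\": true by hand" >&2
    return 1
  fi
  printf '{\n  "debug": true\n}\n' > "$path"
}

# Typical use as root (assumed default path), then ask dockerd to reload its config:
# write_debug_config /etc/docker/daemon.json && kill -HUP "$(pidof dockerd)"
```

A daemon that is already hung may not react to the reload signal, in which case the setting only takes effect on the next restart.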

FYI, see "configure and debug docker daemon".
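
Docker's daemon troubleshooting documentation also describes forcing a stack trace with SIGUSR1. That trace shows what each goroutine is waiting on, which is usually more informative than the kernel-side futex stack in /proc/1549/stack. A small sketch; the wrapper name request_stack_dump is hypothetical, and where the trace ends up (the daemon log or a separate file named in the log) depends on the Docker version:

```shell
# request_stack_dump PID: ask the process to dump its stacks via SIGUSR1.
# dockerd keeps running after handling this signal.
request_stack_dump() {
  pid="$1"
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "no such process: $pid" >&2
    return 1
  fi
  kill -USR1 "$pid"
}

# Typical use as root:
# request_stack_dump "$(pidof dockerd)"
```

Only send this signal to dockerd itself: a process that does not handle SIGUSR1 is terminated by it.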