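So, the goal is to build a kubernetes cluster out of 4 Raspberry Pis: 1 master and 3 workers. I am following this guide. After the initial setup the cluster worked fine, but it became unusable after a reboot. After some investigation I found that the docker daemon fails to start again after the reboot, which prevents the required kubernetes containers from starting. My filesystem also goes into read-only mode after the reboot. The output of sudo service docker status
shows the following: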
docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2019-10-03 12:23:57 CEST; 21min ago
Docs: https://docs.docker.com
Process: 1126 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/
Main PID: 1126 (code=exited, status=1/FAILURE)
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Stopped Docker Application Container Engine.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Start request repeated too quickly.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Failed with result 'exit-code'.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Failed to start Docker Application Container Engine.
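The status output above only contains systemd's restart bookkeeping; the reason dockerd exited with status 1 is not in it. A minimal sketch for pulling out dockerd's own messages, assuming its journal lines have the same `host unit[pid]:` shape as the lines above. The helper name `dockerd_lines` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only journal lines written by dockerd itself,
# dropping systemd's "Scheduled restart job" / "Start request repeated" lines.
dockerd_lines() {
    grep -E ' dockerd\[[0-9]+\]: '
}

# Intended use on the affected node (not executed by this snippet):
#   sudo journalctl -u docker.service -b --no-pager | dockerd_lines
# If that prints nothing, clearing the start limit and running the daemon
# in the foreground should print the error directly:
#   sudo systemctl reset-failed docker.service
#   sudo dockerd --debug
```

The filter is kept as a separate function so it can be reused on a saved log file as well as on live journalctl output.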
Trying to run any docker command results in
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I installed docker with
curl -sSL get.docker.com | sh && sudo usermod pi -aG docker && newgrp docker
I can't even uninstall it, because it was not installed through apt-get:
sudo apt-get remove docker
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package 'docker' is not installed, so not removed
0 upgraded, 0 newly installed, 0 to remove and 62 not upgraded.
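The message above only says that no package literally named docker is installed. As far as I know, the get.docker.com script installs through the system package manager under other names (typically docker-ce, docker-ce-cli and containerd.io, but that is an assumption to verify on the node). A small sketch for finding the real names; `docker_packages` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: from a list of package names on stdin (one per line),
# keep those whose name starts with "docker" or "containerd".
docker_packages() {
    grep -E '^(docker|containerd)' || true
}

# Intended use on the node (not executed by this snippet):
#   dpkg-query -W -f='${Package}\n' | docker_packages
# Whatever it prints can then be passed to apt-get, for example:
#   sudo apt-get remove docker-ce docker-ce-cli containerd.io
# (those three names are an assumption; use the names actually listed)
```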
The output of journalctl -xe
is
The unit docker.service has entered the 'failed' state with result 'exit-code'.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: A start job for unit docker.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit docker.service has finished with a failure.
--
-- The job identifier is 1060 and the job result is failed.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
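This excerpt again shows only systemd's side. Since the filesystem also comes up read-only after the reboot, one thing that may be worth ruling out is that dockerd simply cannot write its state; that is a guess, not something the logs above confirm. A sketch for checking how the root filesystem is mounted, with `root_mount_mode` as a hypothetical helper that reads /proc/mounts-style lines.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style lines on stdin and print
# "ro" or "rw" for the root filesystem ("/"), based on its mount options.
root_mount_mode() {
    awk '$2 == "/" {
             mode = "rw"
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
         }
         END { if (mode != "") print mode }'
}

# Intended use on the node (not executed by this snippet):
#   root_mount_mode < /proc/mounts
# If it prints "ro", the kernel log may say why the remount happened:
#   sudo dmesg | grep -i 'read-only'
```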
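I googled the various errors I ran into, but they only led to GitHub issues that are two years old and either unresolved or closed with solutions that did not help. (See this, this and this.)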
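I also tried sudo systemctl enable docker
to make it start automatically, but I don't think that is the problem. It looks like a configuration problem that a fresh install would fix, but that is exactly what I need to avoid if I want to run a kubernetes cluster that can be shut down gracefully.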
I really hope someone can help me.