I have a swarm with 3 managers and 3 workers, as shown below:
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ocnuul8dcbrf4gjtdzv06t0yf *   manager1   Ready    Active         Leader           18.06.0-ce
z297dhtfon50pt4hllu4qfz6i     manager2   Ready    Active         Reachable        18.06.0-ce
ondpdzyq06pd3oysn34p4xi9o     manager3   Ready    Active         Reachable        18.06.0-ce
0bls0g65gee1wbv7wr6rwgbjk     worker1    Ready    Active                          18.06.0-ce
mxtg28slr5rvljrayaf4k1wkk     worker2    Ready    Active                          18.06.0-ce
hqu1436bvbar9srbr34er3fl4     worker3    Ready    Active                          18.06.0-ce
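All of the managers are available.
However, when I deploy a service on the swarm, the task placed on manager3 stays in the Preparing state: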
ID             NAME       IMAGE                                       NODE       DESIRED STATE   CURRENT STATE              ERROR   PORTS
lmhpsgeqax13   web-fe.1   nigelpoulton/pluralsight-docker-ci:latest   worker1    Running         Running 19 minutes ago
nivas3gkh0pa   web-fe.2   nigelpoulton/pluralsight-docker-ci:latest   worker3    Running         Running 19 minutes ago
5plwh46jri3t   web-fe.3   nigelpoulton/pluralsight-docker-ci:latest   worker2    Running         Running 19 minutes ago
l1ykqzgzbgmb   web-fe.4   nigelpoulton/pluralsight-docker-ci:latest   manager2   Running         Running 19 minutes ago
q788hrm6rba9   web-fe.5   nigelpoulton/pluralsight-docker-ci:latest   manager3   Running         Preparing 21 minutes ago
In /var/log/docker.log on manager3 I can see that it fails when trying to establish a connection to manager2's IP (192.168.99.105:2377):
7T00:10:54.230023789Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {192.168.99.105:2377 0 <nil>}. Err :connection error: desc = \"transport: Err
7T00:10:54.230049538Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420a86940, TRANSIENT_FAILURE" module=grpc
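To check whether manager2 is the only peer that manager3 fails to reach, the failing addresses can be pulled out of the log. This is only a rough sketch: the function name `failed_peers` is my own, and the `sed` pattern assumes the warning lines always look like the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): read daemon log text on stdin and
# print each distinct address that a "createTransport failed to connect to"
# warning mentions. Assumes the log line format shown in the excerpt above.
failed_peers() {
    sed -n 's/.*createTransport failed to connect to {\([0-9.]*:[0-9]*\) .*/\1/p' \
        | sort -u
}

# Possible usage on manager3 (log path taken from the question):
#   failed_peers < /var/log/docker.log
```

If this prints only 192.168.99.105:2377, the failures are limited to the connection towards manager2 rather than to every manager.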
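Since manager1 is the leader, I expected manager3 to send its messages/signals to manager1 while preparing, so I don't understand why it is trying to connect to manager2. Can someone help me understand this? Also, how can I recover from this and get manager3's task from the Preparing state to Running?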
Thanks