重新启动作为集群工作人员的Windows服务器后,一旦服务器和docker守护程序重新联机,Windows容器将无限期地陷入“准备”状态。
https://user-images.githubusercontent.com/4528753/65180353-4e5d6e80-da22-11e9-8060-451150865177.png
memberlist: Refuting a suspect message (from: c9347e85405d)
memberlist: Failed to send ping: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its
context.
grpc: addrConn.createTransport failed to connect to {10.60.3.110:2377 0 <nil>}. Err :connection error: desc = "transport: Error while
dialing dial tcp 10.60.3.110:2377: connectex: A socket operation was attempted to an unreachable host.". Reconnecting... [module=grpc]
memberlist: Failed to send ping: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its
context.
grpc: addrConn.createTransport failed to connect to {10.60.3.110:2377 0 <nil>}. Err :connection error: desc = "transport: Error while
dialing dial tcp 10.60.3.110:2377: connectex: A socket operation was attempted to an unreachable host.". Reconnecting... [module=grpc]
agent: session failed [node.id=wuhifvg9li3v5zuq2xu7c6hxa module=node/agent error=rpc error: code = Unavailable desc = all SubConns are
in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.60.3.69:2377:
connectex: A socket operation was attempted to an unreachable host." backoff=6.3s]
Failed to send gossip to 10.60.3.110: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.186: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.186: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.109: write udp 10.60.3.40:7946->10.60.3.109:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its
context.
Failed to send gossip to 10.60.3.110: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its
context.
memberlist: Failed to send gossip to 10.60.3.105:7946: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is
not valid in its context.
memberlist: Failed to send gossip to 10.60.3.186:7946: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is
not valid in its context.
例如,许多错误都是奇怪的... 7946在群集节点之间完全开放,telnet确认了这一点。
我希望看到docker服务容器能迅速启动,并且不会停留在“准备”状态。 Docker映像已被拉出,应该很快。
Client: Docker Engine - Enterprise
Version: 19.03.2
API version: 1.40
Go version: go1.12.8
Git commit: c92ab06ed9
Built: 09/03/2019 16:38:11
OS/Arch: windows/amd64
Experimental: false
Server: Docker Engine - Enterprise
Engine:
Version: 19.03.2
API version: 1.40 (minimum version 1.24)
Go version: go1.12.8
Git commit: c92ab06ed9
Built: 09/03/2019 16:35:47
OS/Arch: windows/amd64
Experimental: false
Client:
Debug Mode: false
Plugins:
cluster: Manage Docker clusters (Docker Inc., v1.1.0-8c33de7)
Server:
Containers: 4
Running: 0
Paused: 0
Stopped: 4
Images: 4
Server Version: 19.03.2
Storage Driver: windowsfilter
Windows:
Logging Driver: json-file
Plugins:
Volume: local
Network: ics l2bridge l2tunnel nat null overlay transparent
Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
Swarm: active
NodeID: wuhifvg9li3v5zuq2xu7c6hxa
Is Manager: false
Node Address: 10.60.3.40
Manager Addresses:
10.60.3.110:2377
10.60.3.186:2377
10.60.3.69:2377
Default Isolation: process
Kernel Version: 10.0 18362 (18362.1.amd64fre.19h1_release.190318-1202)
Operating System: Windows Server Datacenter Version 1903 (OS Build 18362.356)
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 8GiB
Name: SWARMWORKER1
ID: V2WJ:OEUM:7TUQ:WPIO:UOK4:IAHA:KWMN:RQFF:CAUO:LUB6:DJIJ:OVBX
Docker Root Dir: E:\docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: this node is not a swarm manager - check license status on a manager node
我发布了关于Moby回购的问题,但从未听说过: https://github.com/moby/moby/issues/39955
我发现暂时解决此问题的唯一方法是从群集中清空节点,删除docker文件,重新安装Windows“容器”功能,然后重新加入群集。但是,重新启动后会再次发生。
有趣的是,当我在Windows Worker上看到一个处于“正在准备”状态的群集任务时,服务器似乎根本没有做任何事情,就像经理认为该工人正在准备容器一样,但是不是...
有人有什么建议吗?