I have a two-node cluster (both running docker 19.03.1) and I am running into trouble: I keep getting
"Pool overlaps with other one on this address space" module=node/agent/taskmanager
messages.
The swarm was created vanilla:
manager@ docker swarm init
worker@ docker swarm join...
I created an overlay network:
manager@ docker network create -d overlay ids-net --attachable
Creating a simple service on both nodes works fine:
manager@ docker service create --replicas 2 --network ids-net hashicorp/http-echo -text="hello world"
Result:
image hashicorp/http-echo:latest could not be accessed on a registry to record
its digest. Each node will access hashicorp/http-echo:latest independently,
possibly leading to different nodes running different
versions of the image.
ksi39hzojsfjr4jyqck1p4rib
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
The following, however, ends up in an endless loop:
manager@ docker service create --replicas 2 --publish published=5678,target=5678 --network ids-net hashicorp/http-echo -text="hello world"
It shows:
image hashicorp/http-echo:latest could not be accessed on a registry to record
its digest. Each node will access hashicorp/http-echo:latest independently,
possibly leading to different nodes running different
versions of the image.
bjjxxomsgvsoitf55l7vuuz74
overall progress: 0 out of 2 tasks
1/2: Pool overlaps with other one on this address space
2/2: Pool overlaps with other one on this address space
Syslog shows the following:
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.283849008+02:00" level=debug msg="state changed" module=node/agent/taskmanager node.id=50coluxfs0lnx1kf07mhckito service.id=d3rxusuxfk18tuvi24l198btp state.desired=READY state.transition="ACCEPTED->PREPARING" task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.283977128+02:00" level=debug msg="(*Agent).UpdateTaskStatus" module=node/agent node.id=50coluxfs0lnx1kf07mhckito task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.284226242+02:00" level=debug msg="task status reported" module=node/agent node.id=50coluxfs0lnx1kf07mhckito
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.284870334+02:00" level=debug msg="(*Agent).UpdateTaskStatus" module=node/agent node.id=50coluxfs0lnx1kf07mhckito task.id=o036l4zcbzvnccjsp44fygnfr
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285156543+02:00" level=debug msg="Allocating IPv4 pools for network ingress (ozjtk12iougu8fqjliqspvxx2)"
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285200492+02:00" level=debug msg="RequestPool(LocalDefault, 10.255.0.0/16, , map[], false)"
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285228022+02:00" level=error msg="fatal task error" error="Pool overlaps with other one on this address space" module=node/agent/taskmanager node.id=50coluxfs0lnx1kf07mhckito service.id=d3rxusuxfk18tuvi24l198btp task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285265876+02:00" level=debug msg="state changed" module=node/agent/taskmanager node.id=50coluxfs0lnx1kf07mhckito service.id=d3rxusuxfk18tuvi24l198btp state.desired=READY state.transition="PREPARING->REJECTED" task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285236079+02:00" level=debug msg="task status reported" module=node/agent node.id=50coluxfs0lnx1kf07mhckito
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285726857+02:00" level=debug msg="(*Agent).UpdateTaskStatus" module=node/agent node.id=50coluxfs0lnx1kf07mhckito task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.286082096+02:00" level=debug msg="task status reported" module=node/agent node.id=50coluxfs0lnx1kf07mhckito
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.286697616+02:00" level=debug msg="(*Agent).UpdateTaskStatus" module=node/agent node.id=50coluxfs0lnx1kf07mhckito task.id=wmu9898y2yl01ga5v40xfojmi
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.287043607+02:00" level=debug msg="task status reported" module=node/agent node.id=50coluxfs0lnx1kf07mhckito
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.316386815+02:00" level=debug msg="state for task wmu9898y2yl01ga5v40xfojmi updated to REJECTED" method="(*Dispatcher).processUpdates" module=dispatcher node.id=50coluxfs0lnx1kf07mhckito state.transition="ASSIGNED->REJECTED" task.id=wmu9898y2yl01ga5v40xfojmi
I suspect there is a problem with the overlay network.
manager@ docker inspect <network id>
produces:
[
    {
        "Name": "ids-net",
        "Id": "jzvu45w1b247whq6qsx3v7fdy",
        "Created": "2019-07-31T10:10:38.436102588+02:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            ...
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4098"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "22a3eb7a8eec",
                "IP": "192.168.100.92"
            }
        ]
    }
]
On the worker node, the network does not exist. (Is that correct?)
As requested by @BMitch, here is the output of ip r on both machines:
manager@ ip r
default via 192.168.100.11 dev ens192 onlink
10.255.0.0/24 dev docker0 proto kernel scope link src 10.255.0.1
10.255.1.0/24 dev br-686969a42803 proto kernel scope link src 10.255.1.1
10.255.23.0/24 dev docker_gwbridge proto kernel scope link src 10.255.23.1
10.255.42.0/24 dev br-c03d759e1553 proto kernel scope link src 10.255.42.1
192.168.100.0/24 dev ens192 proto kernel scope link src 192.168.100.92
worker@ ip r
default via 192.168.100.11 dev eth0 onlink
10.254.0.0/24 dev docker0 proto kernel scope link src 10.254.0.1
10.255.3.0/24 dev docker_gwbridge proto kernel scope link src 10.255.3.1
10.255.4.0/24 dev br-88f241f38441 proto kernel scope link src 10.255.4.1
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.106
This is the /etc/docker/daemon.json on the manager:
manager@ cat /etc/docker/daemon.json
{
"registry-mirrors": ["https://repo.ids.net"],
"default-address-pools": [
{"base":"10.255.255.0/16","size":24}
]
}
The one on the worker looks different:
worker@ cat /etc/docker/daemon.json
{
"registry-mirrors": ["https://repo.ids.net"],
"default-address-pools": [
{"base":"10.254.255.0/16","size":24}
]
}
This is the ingress network configuration:
root@docker:/etc/nginx/conf.d# docker network inspect ozjtk12iougu
[
    {
        "Name": "ingress",
        "Id": "ozjtk12iougu8fqjliqspvxx2",
        "Created": "2019-07-31T07:36:57.368162913Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.255.0.0/16",
                    "Gateway": "10.255.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": null
    }
]
I have purged both systems several times and set both servers up again from scratch.
Can someone point me in the right direction?
Thanks, M
Answer 0 (score: 1)
With a published port, docker attempts to configure the ingress network:
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285156543+02:00" level=debug msg="Allocating IPv4 pools for network ingress (ozjtk12iougu8fqjliqspvxx2)"
Aug 01 15:47:03 docker dockerd[1106]: time="2019-08-01T15:47:03.285200492+02:00" level=debug msg="RequestPool(LocalDefault, 10.255.0.0/16, , map[], false)"
That appears to be the entire /16 of the ingress subnet, and it overlaps with other subnets that docker has already allocated inside that /16 block; the quick check after the routes below makes the overlap explicit:
10.255.0.0/24 dev docker0 proto kernel scope link src 10.255.0.1
10.255.1.0/24 dev br-686969a42803 proto kernel scope link src 10.255.1.1
10.255.23.0/24 dev docker_gwbridge proto kernel scope link src 10.255.23.1
10.255.42.0/24 dev br-c03d759e1553 proto kernel scope link src 10.255.42.1
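To make the overlap explicit, here is a quick check with Python's standard ipaddress module (assuming python3 is available on the host; the subnets are copied from the log and routes above):

manager@ python3 -c "
import ipaddress
# the /16 the ingress network requests, from the RequestPool log line above
ingress = ipaddress.ip_network('10.255.0.0/16')
# the bridge subnets already routed on the manager
for s in ('10.255.0.0/24', '10.255.1.0/24', '10.255.23.0/24', '10.255.42.0/24'):
    print(s, 'overlaps ingress:', ingress.overlaps(ipaddress.ip_network(s)))
"

Every line prints True: each of those bridge subnets sits inside the /16 that ingress requests for itself.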
My best guess is that you configured the networks and the default address pools in incompatible ways. This can be done by creating the ingress network manually, and there may also be settings in the /etc/docker/daemon.json file that produce these collisions; a one-liner to compare what every network was actually given is sketched below.
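As a quick way to compare the allocations, docker network inspect accepts Go templates; this sketch lists every network next to its IPAM config (networks without one just print null or []):

manager@ docker network inspect -f '{{.Name}}: {{json .IPAM.Config}}' $(docker network ls -q)

Any two entries that fall inside the same /16 are candidates for this error.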
Edit: from your update, that guess appears to be correct. You have configured three different parts of docker to use the same address space, and that will result in collisions. The ingress network is one of those parts, and it simply takes the entire address space for itself. You need to configure the bridge networks (the default-address-pools in daemon.json), the overlay networks (the address pool option you almost certainly passed to docker swarm init), and the ingress network (which you likely created manually) each with a separate, non-overlapping CIDR block; a sketch of such a layout follows.
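A minimal sketch of one non-overlapping layout, assuming the swarm can be rebuilt (the 10.10/10.20/10.30 blocks are arbitrary example values, not requirements):

# /etc/docker/daemon.json on each node (keeping your registry-mirrors entry):
{
  "registry-mirrors": ["https://repo.ids.net"],
  "default-address-pools": [
    {"base":"10.10.0.0/16","size":24}
  ]
}

# overlay pool and ingress each get their own, different blocks:
manager@ docker swarm init --default-addr-pool 10.20.0.0/16 --default-addr-pool-mask-length 24
manager@ docker network rm ingress
manager@ docker network create --driver overlay --ingress --subnet 10.30.0.0/16 --gateway 10.30.0.1 ingress

Note that docker asks for confirmation before removing the routing-mesh network, and removing/recreating ingress only works while no service is publishing ports through it.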