"Peer names collision" error when setting up Weave Net in Kubernetes

Time: 2018-06-13 17:54:00

Tags: kubernetes traefik

I am setting up a Kubernetes cluster and cannot get Weave Net to work properly.

I have 3 nodes: rowlf (master), rizzo and fozzie. The pods look fine:

NAMESPACE     NAME                                READY     STATUS    RESTARTS   AGE
kube-system   pod/etcd-rowlf                      1/1       Running   0          32m
kube-system   pod/kube-apiserver-rowlf            1/1       Running   9          33m
kube-system   pod/kube-controller-manager-rowlf   1/1       Running   0          32m
kube-system   pod/kube-dns-686d6fb9c-kjdxt        3/3       Running   0          33m
kube-system   pod/kube-proxy-6kpr9                1/1       Running   0          9m
kube-system   pod/kube-proxy-f7nk5                1/1       Running   0          33m
kube-system   pod/kube-proxy-nrbbl                1/1       Running   0          21m
kube-system   pod/kube-scheduler-rowlf            1/1       Running   0          32m
kube-system   pod/weave-net-4sj4n                 2/2       Running   1          21m
kube-system   pod/weave-net-kj6q7                 2/2       Running   1          9m
kube-system   pod/weave-net-nsp22                 2/2       Running   0          30m

But the Weave status shows a failed connection:

$ kubectl exec -n kube-system weave-net-nsp22 -c weave -- /home/weave/weave --local status

Version: 2.3.0 (up to date; next check at 2018/06/14 00:30:09)

Service: router
Protocol: weave 1..2
Name: 7a:8f:22:1f:0a:17(rowlf)
Encryption: disabled
PeerDiscovery: enabled
Targets: 1
Connections: 1 (1 failed)
Peers: 1
TrustedSubnets: none

Service: ipam
Status: ready
Range: 10.32.0.0/12
DefaultSubnet: 10.32.0.0/12

First, I don't understand why the connection is marked as failed. In the logs I found these lines:

INFO: 2018/06/13 17:22:59.170536 ->[172.16.20.12:54077] connection accepted
INFO: 2018/06/13 17:22:59.480262 ->[172.16.20.12:54077|7a:8f:22:1f:0a:17(rowlf)]: connection shutting down due to error: local "7a:8f:22:1f:0a:17(rowlf)" and remote "7a:8f:22:1f:0a:17(rizzo)" peer names collision
INFO: 2018/06/13 17:34:12.668693 ->[172.16.20.13:52541] connection accepted
INFO: 2018/06/13 17:34:12.672113 ->[172.16.20.13:52541|7a:8f:22:1f:0a:17(rowlf)]: connection shutting down due to error: local "7a:8f:22:1f:0a:17(rowlf)" and remote "7a:8f:22:1f:0a:17(fozzie)" peer names collision

The second thing I don't understand is the "peer names collision" error. Is this normal?

Here is the log from "rizzo":

kubectl logs weave-net-4sj4n -n kube-system weave

DEBU: 2018/06/13 17:22:58.731864 [kube-peers] Checking peer "7a:8f:22:1f:0a:17" against list &{[{7a:8f:22:1f:0a:17 rowlf}]}
INFO: 2018/06/13 17:22:58.833350 Command line options: map[conn-limit:100 docker-api: host-root:/host http-addr:127.0.0.1:6784 ipalloc-range:10.32.0.0/12 no-dns:true expect-npc:true name:7a:8f:22:1f:0a:17 datapath:datapath db-prefix:/weavedb/weave-net ipalloc-init:consensus=2 metrics-addr:0.0.0.0:6782 nickname:rizzo port:6783]
INFO: 2018/06/13 17:22:58.833525 weave  2.3.0
INFO: 2018/06/13 17:22:59.119956 Bridge type is bridged_fastdp
INFO: 2018/06/13 17:22:59.120025 Communication between peers is unencrypted.
INFO: 2018/06/13 17:22:59.141576 Our name is 7a:8f:22:1f:0a:17(rizzo)
INFO: 2018/06/13 17:22:59.141787 Launch detected - using supplied peer list: [172.16.20.12 172.16.20.11]
INFO: 2018/06/13 17:22:59.141894 Checking for pre-existing addresses on weave bridge
INFO: 2018/06/13 17:22:59.157517 [allocator 7a:8f:22:1f:0a:17] Initialising with persisted data
INFO: 2018/06/13 17:22:59.157884 Sniffing traffic on datapath (via ODP)
INFO: 2018/06/13 17:22:59.158806 ->[172.16.20.11:6783] attempting connection
INFO: 2018/06/13 17:22:59.159081 ->[172.16.20.12:6783] attempting connection
INFO: 2018/06/13 17:22:59.159815 ->[172.16.20.12:42371] connection accepted
INFO: 2018/06/13 17:22:59.161572 ->[172.16.20.12:6783|7a:8f:22:1f:0a:17(rizzo)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/06/13 17:22:59.161836 ->[172.16.20.12:42371|7a:8f:22:1f:0a:17(rizzo)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/06/13 17:22:59.265736 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/06/13 17:22:59.266483 Listening for metrics requests on 0.0.0.0:6782
INFO: 2018/06/13 17:22:59.443937 ->[172.16.20.11:6783|7a:8f:22:1f:0a:17(rizzo)]: connection shutting down due to error: local "7a:8f:22:1f:0a:17(rizzo)" and remote "7a:8f:22:1f:0a:17(rowlf)" peer names collision
INFO: 2018/06/13 17:23:00.355761 [kube-peers] Added myself to peer list &{[{7a:8f:22:1f:0a:17 rowlf}]}
DEBU: 2018/06/13 17:23:00.367309 [kube-peers] Nodes that have disappeared: map[]
INFO: 2018/06/13 17:34:12.671287 ->[172.16.20.13:60523] connection accepted
INFO: 2018/06/13 17:34:12.674712 ->[172.16.20.13:60523|7a:8f:22:1f:0a:17(rizzo)]: connection shutting down  due to error: local "7a:8f:22:1f:0a:17(rizzo)" and remote "7a:8f:22:1f:0a:17(fozzie)" peer names collision

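I ask because this is now the fourth time I have reinstalled everything from scratch, and each time I run into trouble connecting from traefik to a pod on another host. I blame the network because it does not look healthy. Can you tell me whether the setup is correct so far? Are these errors normal, or do I have to care about them? And finally: how should I ask for help, and what information do I need to provide so that people like you can easily help me out of this frustrating situation?

Here are my versions: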

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}

Thanks.

++++ UPDATE ++++ I reset the machine ID as mentioned here: https://github.com/weaveworks/weave/issues/2767 but that caused my machine to reboot constantly!

kernel:[ 2257.674153] Internal error: Oops: 80000007 [#1] SMP ARM

2 Answers:

Answer 0: (score: 1)

I finally found the solution here: https://github.com/weaveworks/weave/issues/3314 We have to disable fastDP!
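
For readers wondering what disabling fastDP can look like on a Kubernetes install, here is a minimal sketch. It assumes the weave container honors the `WEAVE_NO_FASTDP` environment variable described in the Weave Net documentation, and that the DaemonSet is called `weave-net` with a container called `weave` (as the pod names and the `-c weave` flag in the question suggest). The function name `disable_fastdp` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: set WEAVE_NO_FASTDP on the weave container so the
# router does not use the fast datapath.
# Assumption: DaemonSet "weave-net" in namespace "kube-system",
# container "weave".
disable_fastdp() {
    kubectl -n kube-system set env daemonset/weave-net \
        --containers=weave WEAVE_NO_FASTDP=true
}

# Usage:
# disable_fastdp
```

Depending on the DaemonSet's update strategy, the weave pods may have to be deleted by hand before they restart with the new setting. After a restart, the "Bridge type is ..." log line should no longer report `bridged_fastdp`.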

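Answer 1: (score: 0)

I had the same problem, and disabling fastDP did not work for me. I found that the cause was that I had cloned all nodes from the same OS image, so /etc/machine-id had the same value on every node.

I removed the machine ID on all nodes and generated a new one with the following commands: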
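
A quick way to check for this condition is to compare the machine IDs the nodes report. The sketch below is only an illustration: the function name `find_duplicate_machine_ids` is hypothetical, and it expects one "node, machine ID" pair per line on standard input.

```shell
#!/bin/sh
# Reads "node<TAB>machine-id" lines on stdin and prints every machine ID
# that is shared by more than one node, followed by the nodes using it.
find_duplicate_machine_ids() {
    awk '
        { count[$2]++; nodes[$2] = nodes[$2] " " $1 }
        END { for (id in count) if (count[id] > 1) print id ":" nodes[id] }
    '
}

# Usage against a live cluster (each node reports its machineID in its status):
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.machineID}{"\n"}{end}' \
#     | find_duplicate_machine_ids
```

No output means every node has its own machine ID. Any printed line names an ID shared by several nodes, which matches the identical `7a:8f:22:1f:0a:17` peer name seen on rowlf, rizzo and fozzie in the question.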

sudo rm /etc/machine-id
sudo systemd-machine-id-setup

Then I reset my cluster.
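
To confirm the result, the status command from the question can be run again and its "Connections:" line checked. This is a minimal sketch: the helper name `weave_connections_ok` is hypothetical, and it only looks for the word "failed" on that line, as seen in the question's output (`Connections: 1 (1 failed)`).

```shell
#!/bin/sh
# Reads `weave status` output on stdin; returns non-zero if the
# "Connections:" line reports any failed connection.
weave_connections_ok() {
    ! grep -Eq '^[[:space:]]*Connections:.*failed'
}

# Usage, reusing the command from the question (the pod name will differ
# after the cluster has been reset):
# kubectl exec -n kube-system weave-net-nsp22 -c weave -- \
#     /home/weave/weave --local status | weave_connections_ok && echo healthy
```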