Kubernetes pods cannot communicate using Weave

Date: 2016-12-19 04:16:29

Tags: networking kubernetes

I have a simple cluster setup with just a single master running CoreOS. I run the kubelet via CoreOS's kubelet-wrapper script, and I am using Weave for the pod network.

The API server, controller manager, and scheduler all run correctly as systemd units with host networking.

My problem is that pods cannot communicate via service IPs or internet IPs. Each pod appears to have a network interface, routes, and a default gateway, yet I always get "no route to host".

$ kubectl  run -i --tty busybox --image=busybox --generator="run-pod/v1" --overrides='{"spec": {"template": {"metadata": {"annotations": {"scheduler.alpha.kubernetes.io/tolerations": "[{\"key\":\"dedicated\",\"value\":\"master\",\"effect\":\"NoSchedule\"}]"}}}}}'
Waiting for pod default/busybox to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.32.0.1       0.0.0.0         UG    0      0        0 eth0
10.32.0.0       0.0.0.0         255.240.0.0     U     0      0        0 eth0
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
16: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1410 qdisc noqueue 
    link/ether 62:d0:49:a6:f9:59 brd ff:ff:ff:ff:ff:ff
    inet 10.32.0.7/12 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::60d0:49ff:fea6:f959/64 scope link 
       valid_lft forever preferred_lft forever
/ # ping 10.32.0.6 -c 5
PING 10.32.0.6 (10.32.0.6): 56 data bytes

--- 10.32.0.6 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
/ # wget http://10.32.0.6:80/
Connecting to 10.32.0.6:80 (10.32.0.6:80)
wget: can't connect to remote host (10.32.0.6): No route to host

Internet IPs don't work either:

/ # ping -c 5 172.217.24.132
PING 172.217.24.132 (172.217.24.132): 56 data bytes

--- 172.217.24.132 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
/ # wget http://172.217.24.132/
Connecting to 172.217.24.132 (172.217.24.132:80)
wget: can't connect to remote host (172.217.24.132): No route to host
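
From the host itself, a few checks can help narrow down whether this is a pod-side problem or a problem on the Weave bridge (this is just a sketch; the interface name weave is Weave's default, and it also appears in the iptables rules further down):

core@ia-master1 ~ $ ip -d link show weave            # is the Weave bridge present and up?
core@ia-master1 ~ $ ip route | grep 10.32            # does the host have a route into the pod CIDR?
core@ia-master1 ~ $ ping -c 3 10.32.0.6              # can the host itself reach the pod?
core@ia-master1 ~ $ sudo iptables -L FORWARD -v -n   # are forwarded packets hitting a DROP rule?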

My kubelet unit looks like this:

[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers

Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --hostname-override=192.168.86.50"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.3.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=4"

Environment=KUBELET_VERSION=v1.4.6_coreos.0
Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume cni-conf,kind=host,source=/etc/cni \
  --mount volume=cni-conf,target=/etc/cni \
  --volume cni-bin,kind=host,source=/opt/cni \
  --mount volume=cni-bin,target=/opt/cni"

ExecStart=/usr/lib/coreos/kubelet-wrapper \
  $KUBELET_KUBECONFIG_ARGS \
  $KUBELET_SYSTEM_PODS_ARGS \
  $KUBELET_NETWORK_ARGS \
  $KUBELET_DNS_ARGS \
  $KUBELET_EXTRA_ARGS
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
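
To make sure the kubelet actually picked this unit up and loaded the Weave CNI configuration, I check its journal along these lines (assuming the unit is installed as kubelet.service; the grep pattern is just a convenience):

core@ia-master1 ~ $ sudo systemctl daemon-reload && sudo systemctl restart kubelet
core@ia-master1 ~ $ journalctl -u kubelet --no-pager | grep -i cni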

I have the Weave DaemonSet running in the cluster.

$ kubectl --kubeconfig=ansible/roles/kubernetes-master/admin-user/files/kubeconfig -n kube-system get daemonset
NAME               DESIRED   CURRENT   NODE-SELECTOR   AGE
kube-proxy-amd64   1         1         <none>          22h
weave-net          1         1         <none>          22h

The Weave logs look like this:

$ kubectl -n kube-system logs weave-net-me1lz weave
INFO: 2016/12/19 02:19:56.125264 Command line options: map[docker-api: http-addr:127.0.0.1:6784 ipalloc-init:consensus=1 nickname:ia-master1 status-addr:0.0.0.0:6782 datapath:datapath ipalloc-range:10.32.0.0/12 name:52:b1:20:55:0c:fc no-dns:true port:6783]
INFO: 2016/12/19 02:19:56.213194 Communication between peers is unencrypted.
INFO: 2016/12/19 02:19:56.237440 Our name is 52:b1:20:55:0c:fc(ia-master1)
INFO: 2016/12/19 02:19:56.238232 Launch detected - using supplied peer list: [192.168.86.50]
INFO: 2016/12/19 02:19:56.258050 [allocator 52:b1:20:55:0c:fc] Initialising with persisted data
INFO: 2016/12/19 02:19:56.258412 Sniffing traffic on datapath (via ODP)
INFO: 2016/12/19 02:19:56.293898 ->[192.168.86.50:6783] attempting connection
INFO: 2016/12/19 02:19:56.311408 Discovered local MAC 52:b1:20:55:0c:fc
INFO: 2016/12/19 02:19:56.314972 ->[192.168.86.50:47921] connection accepted
INFO: 2016/12/19 02:19:56.370597 ->[192.168.86.50:47921|52:b1:20:55:0c:fc(ia-master1)]: connection shutting down due to error: cannot connect to ourself
INFO: 2016/12/19 02:19:56.381759 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2016/12/19 02:19:56.391405 ->[192.168.86.50:6783|52:b1:20:55:0c:fc(ia-master1)]: connection shutting down due to error: cannot connect to ourself
INFO: 2016/12/19 02:19:56.423633 Listening for metrics requests on 0.0.0.0:6782
INFO: 2016/12/19 02:19:56.990760 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.7.3-coreos-r3&os=linux&signature=1Pty%2FGagYcrEs2TwKnz6IVegmP23z5ifqrP1D9vCzyM%3D&version=1.8.2: x509: failed to load system roots and no roots provided
10.32.0.1
INFO: 2016/12/19 02:19:57.490053 Discovered local MAC 3a:5c:04:54:80:7c
INFO: 2016/12/19 02:19:57.591131 Discovered local MAC c6:1c:f5:43:f0:91
INFO: 2016/12/19 02:34:56.242774 Expired MAC c6:1c:f5:43:f0:91 at 52:b1:20:55:0c:fc(ia-master1)
INFO: 2016/12/19 03:46:29.865157 ->[192.168.86.200:49276] connection accepted
INFO: 2016/12/19 03:46:29.866767 ->[192.168.86.200:49276] connection shutting down due to error during handshake: remote protocol header not recognised: [71 69 84 32 47]
INFO: 2016/12/19 03:46:34.704116 ->[192.168.86.200:49278] connection accepted
INFO: 2016/12/19 03:46:34.754782 ->[192.168.86.200:49278] connection shutting down due to error during handshake: remote protocol header not recognised: [22 3 1 0 242]
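
Since the router reports "Listening for HTTP control messages on 127.0.0.1:6784", its status can also be queried directly from the master node; a quick sanity check would be something like this (as far as I know, /status is the endpoint the weave CLI itself talks to):

core@ia-master1 ~ $ curl -s http://127.0.0.1:6784/status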

The Weave CNI plugin binaries appear to have been created correctly.

core@ia-master1 ~ $ ls /opt/cni/bin/
bridge  cnitool  dhcp  flannel  host-local  ipvlan  loopback  macvlan  ptp  tuning  weave-ipam  weave-net  weave-plugin-1.8.2
core@ia-master1 ~ $ ls /etc/cni/net.d/
10-weave.conf
core@ia-master1 ~ $ cat /etc/cni/net.d/10-weave.conf 
{
    "name": "weave",
    "type": "weave-net"
}

iptables looks like this:

core@ia-master1 ~ $ sudo iptables-save
# Generated by iptables-save v1.4.21 on Mon Dec 19 04:15:14 2016
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [2:120]
:POSTROUTING ACCEPT [2:120]
:DOCKER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-AN54BNMS4EGIFEJM - [0:0]
:KUBE-SEP-BQM5WFNH2M6QPJV6 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-NWV5X2332I4OT4T3 - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -j WEAVE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-AN54BNMS4EGIFEJM -s 192.168.86.50/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-AN54BNMS4EGIFEJM -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-AN54BNMS4EGIFEJM --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 192.168.86.50:443
-A KUBE-SEP-BQM5WFNH2M6QPJV6 -s 10.32.0.6/32 -m comment --comment "default/hostnames:" -j KUBE-MARK-MASQ
-A KUBE-SEP-BQM5WFNH2M6QPJV6 -p tcp -m comment --comment "default/hostnames:" -m tcp -j DNAT --to-destination 10.32.0.6:9376
-A KUBE-SERVICES -d 10.3.0.137/32 -p tcp -m comment --comment "default/hostnames: cluster IP" -m tcp --dport 80 -j KUBE-SVC-NWV5X2332I4OT4T3
-A KUBE-SERVICES -d 10.3.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.3.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.3.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 180 --reap --name KUBE-SEP-AN54BNMS4EGIFEJM --mask 255.255.255.255 --rsource -j KUBE-SEP-AN54BNMS4EGIFEJM
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-AN54BNMS4EGIFEJM
-A KUBE-SVC-NWV5X2332I4OT4T3 -m comment --comment "default/hostnames:" -j KUBE-SEP-BQM5WFNH2M6QPJV6
-A WEAVE -s 10.32.0.0/12 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 10.32.0.0/12 -d 10.32.0.0/12 -j MASQUERADE
-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE
COMMIT
# Completed on Mon Dec 19 04:15:14 2016
# Generated by iptables-save v1.4.21 on Mon Dec 19 04:15:14 2016
*filter
:INPUT ACCEPT [73:57513]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [72:61109]
:DOCKER - [0:0]
:DOCKER-ISOLATION - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-SERVICES - [0:0]
:WEAVE-NPC - [0:0]
:WEAVE-NPC-DEFAULT - [0:0]
:WEAVE-NPC-INGRESS - [0:0]
-A INPUT -j KUBE-FIREWALL
-A INPUT -d 172.17.0.1/32 -i docker0 -p tcp -m tcp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6784 -j DROP
-A INPUT -i docker0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i docker0 -p tcp -m tcp --dport 53 -j ACCEPT
-A FORWARD -i docker0 -o weave -j DROP
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -o weave -j WEAVE-NPC
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-SERVICES -d 10.3.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.3.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-k?Z;25^M}|1s7P3|H9i;*;MhG dst -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-iuZcey(5DeXbzgRFs8Szo]<@p dst -j ACCEPT
COMMIT
# Completed on Mon Dec 19 04:15:14 2016
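
Note that the WEAVE-NPC-DEFAULT rules above match on ipsets, so if the ipset tool is available on the host it might also be worth confirming that those sets are not empty (a sketch; the set names are the ones referenced in the rules above):

core@ia-master1 ~ $ sudo ipset list   # the weave-* sets used by WEAVE-NPC-DEFAULT should not be empty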

What am I doing wrong?

Possibly relevant information:

  • 192.168.86.50 is the master node's IP
  • Pod network CIDR: 10.32.0.0/12 (at least, that is what the master is using)
  • Service CIDR: 10.3.0.0/24
  • API server cluster IP: 10.3.0.1

1 answer:

Answer 0 (score: 2):

I ran into this problem myself, and I think it is more of a CoreOS bug.

CoreOS uses networkd to manage network interfaces, but in some cases (like here) you want to manage certain interfaces yourself. And CNI is broken on CoreOS, because networkd tries to manage the CNI and Weave interfaces.

Related issues / discussions:

I would try something like this in /etc/systemd/network/50-weave.network (using CoreOS alpha):

[Match]
Name=weave datapath vxlan-* dummy*

[Network]
# I'm not sure if DHCP or IPv6AcceptRA are required here...
DHCP=no
IPv6AcceptRA=false
Unmanaged=yes

And this for /etc/systemd/network/50-cni.network:

[Match]
Name=cni*

[Network]
Unmanaged=yes

Then reboot and see if it works! (Or you may want to try CoreOS stable, assuming you are currently on alpha.)
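
To verify that networkd is really leaving those interfaces alone afterwards, the SETUP column of networkctl should report them as unmanaged. Roughly (a sketch; restarting systemd-networkd instead of a full reboot may be enough, but I have not confirmed that on every channel):

sudo systemctl restart systemd-networkd
networkctl list   # weave, datapath and vxlan-* should show "unmanaged" in the SETUP column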