Broken:

kc exec -it -n  iotest app1-b67598997-p9lqk  -c userapp sh

/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve

/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5

/app $ curl -I 10.63.240.10
curl: (7) Failed to connect to 10.63.240.10 port 80: Connection refused

/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:8001          0.0.0.0:*               LISTEN      1/python
tcp        0      0 ::1:50051               :::*                    LISTEN      1/python
tcp        0      0 ::ffff:127.0.0.1:50051  :::*                    LISTEN      1/python

Working:

kc exec -it -n  iotest app1-7d985bfd7b-h5dbr -c userapp sh

/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve

Name:      www.google.com
Address 1: 74.125.206.147 wk-in-f147.1e100.net
Address 2: 74.125.206.105 wk-in-f105.1e100.net
Address 3: 74.125.206.99 wk-in-f99.1e100.net
Address 4: 74.125.206.104 wk-in-f104.1e100.net
Address 5: 74.125.206.106 wk-in-f106.1e100.net
Address 6: 74.125.206.103 wk-in-f103.1e100.net
Address 7: 2a00:1450:400c:c04::68 wk-in-x68.1e100.net

/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5

/app $ curl -I 10.63.240.10
HTTP/1.1 404 Not Found
date: Sun, 29 Jul 2018 15:13:47 GMT
server: envoy
content-length: 0

/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:15000         0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:15001           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:8001          0.0.0.0:*               LISTEN      1/python
tcp        0      0 10.60.2.6:56508         10.60.48.22:9091        ESTABLISHED -
tcp        0      0 127.0.0.1:57768         127.0.0.1:50051         ESTABLISHED -
tcp        0      0 10.60.2.6:43334         10.63.255.44:15011      ESTABLISHED -
tcp        0      0 10.60.2.6:15001         10.60.45.26:57160       ESTABLISHED -
tcp        0      0 10.60.2.6:48946         10.60.45.28:9091        ESTABLISHED -
tcp        0      0 127.0.0.1:49804         127.0.0.1:50051         ESTABLISHED -
tcp        0      0 ::1:50051               :::*                    LISTEN      1/python
tcp        0      0 ::ffff:127.0.0.1:50051  :::*                    LISTEN      1/python
tcp        0      0 ::ffff:127.0.0.1:50051  ::ffff:127.0.0.1:49804  ESTABLISHED 1/python
tcp        0      0 ::ffff:127.0.0.1:50051  ::ffff:127.0.0.1:57768  ESTABLISHED 1/python

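The pods are identical; the only difference is that one of them was restarted.

Does anyone have advice on how to analyze and fix this problem?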

Answer 1

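Some steps to try:

1) Run ifconfig eth0, or whatever the primary interface is. Is the interface up? Are the sent and received packet counts increasing?

2) If the interface is up, run tcpdump while issuing the nslookup command you posted, and check whether the DNS request packets are actually sent out.

3) When connectivity is broken, check which node the pod is scheduled on. Is it the same node every time? If so, do other pods on that node show similar problems?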
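The three checks above could be run from outside the pod roughly as follows. This is only a sketch: the helper function names are hypothetical, the interface name eth0 is an assumption, and it presumes that ifconfig and tcpdump exist in the container image. The namespace, pod, and container names in the usage lines are taken from the question.

```shell
#!/bin/sh
# Hypothetical helpers for the three checks; function names are illustrative.
# Arguments: $1 namespace, $2 pod, $3 container, $4 interface (optional).

# Step 1: show interface state and RX/TX counters.
# Run it twice and compare the packet counts.
check_interface() {
    # eth0 is an assumed default; pass another name as $4 if needed.
    kubectl exec -n "$1" "$2" -c "$3" -- ifconfig "${4:-eth0}"
}

# Step 2: capture DNS traffic (port 53) without resolving names.
# Leave this running and issue nslookup from a second shell in the pod.
# Assumes tcpdump is installed in the container image.
capture_dns() {
    kubectl exec -n "$1" "$2" -c "$3" -- tcpdump -n -i "${4:-eth0}" port 53
}

# Step 3: list pods together with the node each one is scheduled on.
pods_by_node() {
    kubectl get pods -n "$1" -o wide
}

# Usage (uncomment to run against the broken pod from the question):
# check_interface iotest app1-b67598997-p9lqk userapp
# capture_dns     iotest app1-b67598997-p9lqk userapp
# pods_by_node    iotest
```

If the broken pods always show the same value in the NODE column, that points at the node rather than at the application.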

Answer 2

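I ran into the same problem. For now I have worked around it by switching to a 1.9.x GKE version (after spending a lot of time trying to debug why my app was not working).

Hope this helps!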

Network connectivity / DNS issues on a GKE 1.10 Kubernetes cluster
