Cannot connect to external locations from inside a Kubernetes Pod - DNS problem

Time: 2020-07-23 11:19:09

Tags: kubernetes dns systemd kubernetes-pod

I have the following problem. I have a namespace "qa". The Pods in this namespace can communicate with each other.

For example:

kubectl exec -it qa-file-watcher-85575bd8f7-npkns -n qa /bin/bash
root@qa-file-watcher-85575bd8f7-npkns:/usr/src/app# nslookup qa-kafka-broker
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   qa-kafka-broker.qa.svc.cluster.local
Address: 10.102.218.167

However, if I try to reach an external service, e.g. 8.8.8.8 or security.debian.org for an apt-get update, I get the following errors:

root@qa-file-watcher-85575bd8f7-npkns:/usr/src/app# nslookup 8.8.8.8
Server:         10.96.0.10
Address:        10.96.0.10#53

** server can't find 8.8.8.8.in-addr.arpa: SERVFAIL

root@qa-file-watcher-85575bd8f7-npkns:/usr/src/app# nslookup security.debian.org
Server:         10.96.0.10
Address:        10.96.0.10#53

** server can't find security.debian.org.eu-central-1.compute.internal: SERVFAIL

Here is some information about the setup. I am using the bitnami/kubernetes image on an EC2 instance in AWS.

bitnami@ip-172-30-0-120:~/buildAgent/work/aad99852b1e5781f$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

bitnami@ip-172-30-0-120:~/buildAgent/work/aad99852b1e5781f$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

bitnami@ip-172-30-0-120:~/buildAgent/work/aad99852b1e5781f$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 172.30.0.2
search xxxxxxxx.compute.internal default.svc.cluster.local svc.cluster.local cluster.local deb.debian.org
options ndots:5 single request-reopen

DNS=8.8.8.8

CoreDNS runs on Kubernetes with the following configuration:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2020-02-25T12:52:17Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "31099780"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 26a6800a-2ceb-4f29-ab85-82beaec0add8
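
One thing worth checking with this Corefile: "forward . /etc/resolv.conf" sends every query that is not answered inside the cluster to the nameservers listed in the resolv.conf that the CoreDNS container sees. Assuming the CoreDNS pods inherit the node's resolver configuration (the usual default, but not verified here), that would be the 172.30.0.2 entry from the node's file shown above. The helper below is a minimal sketch for listing those upstreams; the function name list_nameservers is made up for illustration.

```shell
# Print the nameserver addresses found in resolv.conf-style text on stdin.
# Comment lines are skipped because their first field is never exactly
# "nameserver".
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Example (run on the node): list_nameservers < /etc/resolv.conf
```

If the address printed here cannot be reached from the pod network, forwarded lookups would be expected to fail while cluster.local names keep resolving.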

Does anyone know what is going wrong here? If more detailed information is needed, please let me know.

Regards and thanks

Edit: These are the pods running in the kube-system namespace:

bitnami@ip-172-30-0-120:~/deployments/qa-deployment$ kubectl get pods -n kube-system
NAME                                          READY   STATUS    RESTARTS   AGE
coredns-6955765f44-5glwz                      1/1     Running   0          151m
coredns-6955765f44-hf2hd                      1/1     Running   0          151m
etcd-ip-172-30-0-120                          1/1     Running   4          9d
heapster-744b794df7-v2vz9                     1/1     Running   1          9d
kube-apiserver-ip-172-30-0-120                1/1     Running   4          9d
kube-controller-manager-ip-172-30-0-120       1/1     Running   7          9d
kube-proxy-lfstn                              1/1     Running   1          9d
kube-scheduler-ip-172-30-0-120                1/1     Running   6          9d
kubernetes-dashboard-8f7798644-m7r8x          1/1     Running   13         9d
kubernetes-metrics-scraper-6b97c6d857-nl98d   1/1     Running   0          8d
local-volume-provisioner-69vrv                1/1     Running   33         9d
monitoring-grafana-845bc8df5f-62d4x           1/1     Running   1          9d
monitoring-influxdb-56d9446bd9-wlrd5          1/1     Running   1          9d
nginx-ingress-controller-574d4c9dcf-fmdgm     1/1     Running   1          9d
registry-86c45b9d9b-pm6zj                     1/1     Running   0          7d23h
weave-net-g78mj                               2/2     Running   5          9d

These are the logs from CoreDNS:

...
...
...

[INFO] 10.32.0.35:49254 - 6294 "AAAA IN monitoring.xxxxxx.de.qa.svc.cluster.local. udp 66 false 512" NXDOMAIN qr,aa,rd 159 0.000297909s
[INFO] 10.32.0.35:55396 - 52809 "A IN monitoring.xxxxxx.de.svc.cluster.local. udp 63 false 512" NXDOMAIN qr,aa,rd 156 0.000152558s
[INFO] 10.32.0.35:55396 - 36432 "AAAA IN monitoring.xxxxxx.de.svc.cluster.local. udp 63 false 512" NXDOMAIN qr,aa,rd 156 0.000384192s
[INFO] 10.32.0.31:54436 - 61896 "AAAA IN xxxxxx.cq5rq6zjwmfc.eu-central-1.rds.amazonaws.com. udp 74 false 512" NOERROR - 0 2.000274796s
[ERROR] plugin/errors: 2 xxxxxx.cq5rq6zjwmfc.eu-central-1.rds.amazonaws.com. AAAA: read udp 10.32.0.30:41402->172.30.0.2:53: i/o timeout
[INFO] 10.32.0.31:54436 - 64312 "A IN xxxxxx.cq5rq6zjwmfc.eu-central-1.rds.amazonaws.com. udp 74 false 512" NOERROR - 0 2.000270418s
[ERROR] plugin/errors: 2 xxxxxx.cq5rq6zjwmfc.eu-central-1.rds.amazonaws.com. A: read udp 10.32.0.30:43606->172.30.0.2:53: i/o timeout
[INFO] 10.32.0.31:54436 - 8384 "AAAA IN postgres.qa.svc.cluster.local. udp 47 false 512" NOERROR qr,aa,rd 146 2.000560668s
[INFO] 10.32.0.31:54436 - 60087 "A IN postgres.qa.svc.cluster.local. udp 47 false 512" NOERROR qr,aa,rd 146 2.000566155s
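
The two [ERROR] lines suggest that queries CoreDNS forwards to 172.30.0.2:53 time out, while names under cluster.local are answered locally. To check whether that pattern holds across the whole log, the following sketch counts timeouts per upstream address. The function name count_upstream_timeouts is hypothetical, and the parsing assumes only the error line format shown above.

```shell
# Read CoreDNS log lines on stdin and count "i/o timeout" errors per
# upstream address (the part after "->" in the error message).
count_upstream_timeouts() {
    awk '/\[ERROR\]/ && /i\/o timeout/ {
        for (i = 1; i <= NF; i++) {
            if (index($i, "->") > 0) {
                split($i, parts, "->")
                upstream = parts[2]
                sub(/:$/, "", upstream)   # drop the trailing colon
                count[upstream]++
            }
        }
    }
    END { for (u in count) print count[u], u }'
}

# Example:
#   kubectl logs -n kube-system coredns-6955765f44-5glwz | count_upstream_timeouts
```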

EDIT2:

I cannot get a shell in the CoreDNS pod with the following command:

bitnami@ip-172-30-0-120:~/deployments/qa-deployment$ kubectl exec -it coredns-6955765f44-5glwz -n kube-system bash
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "2a604d5b8cfad5341acc0d548412f8376fdf063bf97d92d1aaa501841f959671": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"bash\": executable file not found in $PATH": unknown

The resolv.conf inside the file-watcher-service pod in the qa namespace:

root@qa-file-watcher-service-7b7d47c67d-fjb8m:/etc# cat resolv.conf
search qa.svc.cluster.local svc.cluster.local cluster.local eu-central-1.compute.internal default.svc.cluster.local
nameserver 10.96.0.10
options ndots:5
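
This search list together with ndots:5 may explain why the failing lookup above was reported for security.debian.org.eu-central-1.compute.internal and not for security.debian.org itself. The sketch below is a simplified model of the search behaviour described in resolv.conf(5), not an exact reproduction of what nslookup does; the function name search_candidates is hypothetical.

```shell
# Print, in order, the names a resolver would try for NAME, given an
# ndots value and a search list (simplified model of resolv.conf(5)).
# Usage: search_candidates NAME NDOTS DOMAIN...
search_candidates() {
    name=$1
    ndots=$2
    shift 2
    case $name in
        *.) printf '%s\n' "$name"; return ;;   # trailing dot: absolute, no search
    esac
    # Count the dots in the name (number of fields minus one).
    dots=$(printf '%s' "$name" | awk -F. '{ print NF - 1 }')
    # At or above the ndots threshold, the name is tried as-is first.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
    for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
    done
    # Below the threshold, the name as-is is tried last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# Example, using the search list from the pod above:
#   search_candidates security.debian.org 5 \
#       qa.svc.cluster.local svc.cluster.local cluster.local \
#       eu-central-1.compute.internal default.svc.cluster.local
```

In this model the first three candidates fall under cluster.local, and the fourth is the first name that would have to be forwarded upstream, which matches the name in the SERVFAIL message. Querying with a trailing dot (nslookup security.debian.org.) skips the search list entirely.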

0 answers:

No answers