Question

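I installed only the basic Kubernetes packages and started minikube, so I have just the basic kube-system pods. I am trying to investigate why kube-dns cannot resolve domain names.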

Here are the versions I am using.

Docker:

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

minikube version: v0.28.2

Kubectl:

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Kubeadm:

kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

VirtualBox:

Version 5.2.18 r124319 (Qt5.6.2)

Here are the system pods I have deployed:

NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
default       busybox                                 1/1       Running   0          31m
kube-system   etcd-minikube                           1/1       Running   0          32m
kube-system   kube-addon-manager-minikube             1/1       Running   0          33m
kube-system   kube-apiserver-minikube                 1/1       Running   0          33m
kube-system   kube-controller-manager-minikube        1/1       Running   0          33m
kube-system   kube-dns-86f4d74b45-xjfmv               3/3       Running   2          33m
kube-system   kube-proxy-2kkzk                        1/1       Running   0          33m
kube-system   kube-scheduler-minikube                 1/1       Running   0          33m
kube-system   kubernetes-dashboard-5498ccf677-pz87g   1/1       Running   0          33m
kube-system   storage-provisioner                     1/1       Running   0          33m

I also deployed busybox so that I can run commands inside a container:

kubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local mapleworks.com
options ndots:5

and

kubectl exec busybox -- nslookup google.com
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'google.com'
command terminated with exit code 1

The same commands run on the VM itself produce the following results:

cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search mapleworks.com  <<< OUR local DNS server

nslookup google.com
Server:     127.0.1.1
Address:    127.0.1.1#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.13.174

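Problem: kube-dns uses the default nameserver 10.96.0.10, whereas I expected the VM's nameservers to be imported into Kubernetes.

The same setup deployed on a native Windows or Mac platform resolves domain names correctly; only this VM has the problem.

Is this some kind of firewall issue, as mentioned in some other posts?

I have checked the kube-dns container logs, and the most relevant ones come from the sidecar container: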

I0910 15:47:17.667100       1 main.go:51] Version v1.14.8
I0910 15:47:17.667195       1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0910 15:47:17.667240       1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
I0910 15:47:17.668244       1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
W0910 15:50:04.780281       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:34535->127.0.0.1:53: i/o timeout
W0910 15:50:11.781236       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:50887->127.0.0.1:53: i/o timeout
W0910 15:50:24.844065       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:52865->127.0.0.1:53: i/o timeout
W0910 15:50:31.845587       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42053->127.0.0.1:53: i/o timeout

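I assume the I/O timeouts correspond to the manual DNS queries I ran for google.com.

Other than that, all I see here are localhost addresses and port 53.

I just don't know what is going on...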

Answer 1

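Every kubelet in a k8s cluster has the --cluster-dns option. In practice, this option holds the address of the kube-dns Service that fronts the kube-dns Deployment. Each kube-dns Pod in turn has a dnsmasq container, which uses the list of nameservers from the k8s node. You can check this in the dnsmasq container's logs: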

I0720 03:49:51.081031       1 nanny.go:116] dnsmasq[13]: reading /etc/resolv.conf
I0720 03:49:51.081068       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0720 03:49:51.081099       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0720 03:49:51.081130       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0720 03:49:51.081160       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_1>#53
I0720 03:49:51.081190       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_2>#53
I0720 03:49:51.081222       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_N>#53

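When any Pod is created, it gets a nameserver <CLUSTER_DNS_IP> entry in /etc/resolv.conf by default. This is how a Pod can (or cannot) resolve a domain name through the kube-dns Service.

For example, my cluster DNS is 10.233.0.3: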

$ kubectl -n test run -it --image=alpine:3.6 alpine -- sh                                                                      
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf 
nameserver 10.233.0.3
search test.svc.cluster.local svc.cluster.local cluster.local test.kz
/ # nslookup kubernetes-charts.storage.googleapis.com 10.233.0.3
Server:    10.233.0.3
Address 1: 10.233.0.3 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes-charts.storage.googleapis.com
Address 1: 74.125.131.128 lu-in-f128.1e100.net
Address 2: 2a00:1450:4010:c05::80 li-in-x80.1e100.net

So if the Node (the one where kube-dns is scheduled) can resolve a domain name, any Pod can do so as well.
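One thing worth checking on the node is whether its /etc/resolv.conf points at a loopback address, as the 127.0.1.1 entry in the question does. Inside a pod, a loopback address refers to the pod itself rather than the node, so such an entry cannot act as an upstream resolver for kube-dns. Below is a minimal Python sketch of that check. The function names are hypothetical, and the sample text is abbreviated from the question's VM output.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Strip comments, which resolv.conf introduces with '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_nameservers(resolv_conf_text):
    """Return the nameservers that are loopback addresses (127.0.0.0/8 or ::1)."""
    found = []
    for server in parse_nameservers(resolv_conf_text):
        try:
            if ipaddress.ip_address(server).is_loopback:
                found.append(server)
        except ValueError:
            # Not a plain IP address; skip it in this sketch.
            continue
    return found


# Sample abbreviated from the VM output in the question.
VM_RESOLV_CONF = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 127.0.1.1
search mapleworks.com
"""

print(loopback_nameservers(VM_RESOLV_CONF))  # prints ['127.0.1.1']
```

An empty list means that none of the node's nameservers is a loopback address. A non-empty list matches the situation described in the question.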

Answer 2

Check the kube-dns ConfigMap. Did you configure upstreamNameservers: |? More information: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
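As a rough illustration of what such a ConfigMap looks like, here is a small Python sketch that builds the manifest as JSON. It assumes the layout described on the linked page: a ConfigMap named kube-dns in the kube-system namespace whose upstreamNameservers value is a JSON array of IP addresses. The helper name is hypothetical, and the two addresses are example values only, so substitute your own resolvers.

```python
import json


def kube_dns_configmap(upstream_nameservers):
    """Build a kube-dns ConfigMap manifest that sets upstream nameservers."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "kube-dns", "namespace": "kube-system"},
        # kube-dns expects this value to be a JSON array encoded as a string.
        "data": {"upstreamNameservers": json.dumps(upstream_nameservers)},
    }


# Example addresses only; replace them with resolvers reachable from the pods.
manifest = kube_dns_configmap(["172.28.1.3", "172.28.1.4"])
print(json.dumps(manifest, indent=2))
```

kubectl accepts JSON manifests, so the printed output can be saved to a file and applied with kubectl apply -f.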

Answer 3

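With the help of a colleague, I managed to solve this problem. It turns out that the desktop and server versions of Ubuntu are set up differently. On the server, /etc/network/interfaces lists the primary interface, which prevents the NetworkManager process from running its local dnsmasq service.

I added the following lines to that file on the desktop installation: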

#Primary Network Interfaces
auto enp0s3
iface enp0s3 inet dhcp

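After that, the upstream nameserver addresses were passed to kube-dnsmasq, which could then resolve any DNS request.

Below is an example of the NetworkManager process running after the change: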

gilles@gilles-VirtualBox:~$ ps -ef | grep Network
root       870     1  0 16:52 ?        00:00:00 /usr/sbin/NetworkManager --no-daemon
gilles    6991  5316  0 16:55 pts/17   00:00:00 grep --color=auto Network

Below is an example of the dnsmasq container logs after the change:

I0911 20:52:47.878050       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0911 20:52:47.878063       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0911 20:52:47.878070       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0911 20:52:47.878080       1 nanny.go:116] dnsmasq[10]: reading /etc/resolv.conf
I0911 20:52:47.878086       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0911 20:52:47.878092       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0911 20:52:47.878097       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0911 20:52:47.878103       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.3#53
I0911 20:52:47.878109       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.4#53

The last two lines appeared only after the change.

Then:
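To check this without reading the log by eye, the following Python sketch pulls the upstream nameservers out of dnsmasq log text. It assumes the line format shown in the log above: entries ending in "for domain ..." are per-domain forwarders, and entries without that suffix are the general upstreams. The function name is hypothetical.

```python
import re

# "using nameserver <addr>#<port>", optionally followed by "for domain <name>".
_NAMESERVER_LINE = re.compile(r"using nameserver (\S+?)#(\d+)(?: for domain (\S+))?")


def upstream_nameservers(log_text):
    """Return nameservers that dnsmasq uses without a per-domain restriction."""
    upstreams = []
    for line in log_text.splitlines():
        match = _NAMESERVER_LINE.search(line)
        if match is None:
            continue
        address, domain = match.group(1), match.group(3)
        # Keep only general upstreams, and keep each address once.
        if domain is None and address not in upstreams:
            upstreams.append(address)
    return upstreams


# Sample lines copied from the log above.
SAMPLE_LOG = """\
I0911 20:52:47.878097       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0911 20:52:47.878103       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.3#53
I0911 20:52:47.878109       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.4#53
"""

print(upstream_nameservers(SAMPLE_LOG))  # prints ['172.28.1.3', '172.28.1.4']
```

An empty result corresponds to the state before the change, where dnsmasq had no upstream to forward external names to.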

kubectl exec busybox -- nslookup google.com
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2607:f8b0:4020:804::200e yul02s04-in-x0e.1e100.net
Address 2: 172.217.13.110 yul02s04-in-f14.1e100.net

I hope this is useful to others.

Answer 4

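This looks like a problem with kube-dns connecting to the localhost DNS. You can manually configure kube-dns not to use the localhost DNS and to query an external DNS server directly instead.

Edit the CoreDNS configuration: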

kubectl -n kube-system edit configmap coredns

Change the line:

 forward . /etc/resolv.conf {

to:

 forward . 8.8.8.8 {

Restart the CoreDNS pods:

kubectl --namespace=kube-system delete pod -l k8s-app=kube-dns

For details, see the post https://runkiss.blogspot.com/2021/01/kubernetes-coredns-external-resolving.html
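If you would rather script the edit than make it by hand, the substitution can be sketched in Python as below. This is only an illustration for clusters whose DNS is CoreDNS. The Corefile excerpt is an assumed example, not taken from a real cluster, and the function name is hypothetical.

```python
import re

# A "forward . <upstreams>" line, with an optional trailing "{" that opens a block.
_FORWARD_LINE = re.compile(
    r"^([ \t]*forward[ \t]+\.[ \t]+)(.*?)([ \t]*\{)?[ \t]*$", re.MULTILINE
)


def set_forward_upstreams(corefile, upstreams):
    """Replace the upstream list on every 'forward .' line of a Corefile."""
    targets = " ".join(upstreams)
    return _FORWARD_LINE.sub(
        # Keep the indentation and the opening brace; swap only the upstreams.
        lambda m: m.group(1) + targets + (m.group(3) or ""),
        corefile,
    )


# Assumed Corefile excerpt, for illustration only.
COREFILE = """\
.:53 {
    errors
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
}
"""

print(set_forward_upstreams(COREFILE, ["8.8.8.8"]))
```

The result still has to be written back to the coredns ConfigMap, and the pods restarted as described above, before it takes effect.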

kube-dns cannot resolve domain names
