pod中的SkyDNS MissingClusterDNS

时间:2016-06-21 07:44:41

标签: dns kubernetes skydns kubernetes-health-check

我在3台REHEL7服务器上安装了kubernetes 1.2.4(没有互联网访问,一切都是由ansible推动的。)

编辑:请参阅问题的结尾

除了文档中给出的kube-dns示例外,我的所有工作都很棒。我做了几个测试,几个configuraiton,重新创建整个pod ...我总是有这个" MissingClusterDNS"错误:

enter code here20m     20m     2   {kubelet k8s-minion-1.XXXXXX}                   Warning     MissingClusterDNS   kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.

如您所见,kube-dns正在运行:

kubectl get svc kube-dns --namespace=kube-system
NAME       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns   172.16.0.99   <none>        53/UDP,53/TCP   15m

kubelete有正确的选择

KUBELET_ARGS=" --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local "

证明:

ps ax | grep kubelet
 6077 ?        Ssl    0:07 /opt/kubernetes/bin/kubelet --logtostderr=true --v=0 --address=0.0.0.0 --port=10250 --hostname-override=k8s-minion-1.XXXXXX --api-servers=http://k8s-master.XXXXXX:8080 --allow-privileged=false  --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local

但是,DNS pod作为一个未运行的容器:

kubectl get pods  --namespace=kube-system
NAME                 READY     STATUS             RESTARTS   AGE
kube-dns-v11-f2f4a   3/4       CrashLoopBackOff   7          18m

日志是明确的:

Warning Unhealthy   Readiness probe failed: Get http://172.16.23.2:8081/readiness: dial tcp 172.16.23.2:8081: connection refused
...
Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "kube2sky" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube2sky pod=kube-dns-v11-f2f4a_kube-system(27d70b7c-36f9-11e6-b4fe-fa163ee85c45)"

如果您需要更多信息:

$ kubectl get pods  --namespace=kube-system
NAME                 READY     STATUS             RESTARTS   AGE
kube-dns-v11-f2f4a   3/4       CrashLoopBackOff   7          18m

-------------------------  

$ kubectl describe rc  --namespace=kube-system
Name:       kube-dns-v11
Namespace:  kube-system
Image(s):   our.registry/gcr.io/google_containers/etcd-amd64:2.2.1,our.registry/gcr.io/google_containers/kube2sky:1.14,our.registry/gcr.io/google_containers/skydns:2015-10-13-8c72f8c,our.registry/gcr.io/google_containers/exechealthz:1.0
Selector:   k8s-app=kube-dns,version=v11
Labels:     k8s-app=kube-dns,kubernetes.io/cluster-service=true,version=v11
Replicas:   1 current / 1 desired
Pods Status:    1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Volumes:
  etcd-storage:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium: 
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  19m       19m     1   {replication-controller }           Normal      SuccessfulCreate    Created pod: kube-dns-v11-f2f4a

-------------------------------------------------------

$ kubectl get all --all-namespaces 
NAMESPACE     NAME                 DESIRED        CURRENT            AGE
kube-system   kube-dns-v11         1              1                  24m
NAMESPACE     NAME                 CLUSTER-IP     EXTERNAL-IP        PORT(S)         AGE
default       kubernetes           172.16.0.1     <none>             443/TCP         27m
kube-system   kube-dns             172.16.0.99   <none>             53/UDP,53/TCP   24m
NAMESPACE     NAME                 READY          STATUS             RESTARTS        AGE
default       busybox              1/1            Running            0               23m
kube-system   kube-dns-v11-f2f4a   3/4            CrashLoopBackOff   9               24m

如果有人可以帮我理解这个问题......

注意我使用https://github.com/kubernetes/kubernetes/tree/release-1.2/cluster/addons/dns rc和svc,我只改变了:

  • clusterIp到我的kubernetes ip范围内的有效IP
  • cluster domain:kubernetes.local
  • cluster dns:172.16.0.99

编辑: 3/4 kube-dns工作的问题来自证书。所以我可以确认SkyDNS现在正在运行。

NAME                 READY          STATUS        RESTARTS        AGE
kube-dns-v11-c96d5   4/4            Running       0               9m

使用cluster-api-tester:

kubectl logs --tail=80 kube-dns-v11-c96d5 kube2sky --namespace=kube-system
I0621 13:27:52.070730       1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0621 13:27:53.073614       1 kube2sky.go:529] Using https://192.168.0.1:443 for kubernetes master
I0621 13:27:53.073632       1 kube2sky.go:530] Using kubernetes API <nil>
I0621 13:27:53.074020       1 kube2sky.go:598] Waiting for service: default/kubernetes
I0621 13:27:53.166188       1 kube2sky.go:660] Successfully added DNS record for Kubernetes service.

但是出现了其他问题。

  • now&#34;使用kubernetes API nil&#34;而不是好的版本
  • 来自kubernetes文档的busybox示例仍然赢得了&resolv kubernetes.local

我会更多调查。但我解决了skydns的问题。感谢

1 个答案:

答案 0 :(得分:2)

您的一个DNS容器尚未准备就绪。那是&#34; Ready 3/4&#34;装置

最好的办法是使用kubectl logs <pod> <container>命令获取失败的容器日志。如果需要从已经失败的容器中获取日志,可以添加kubectl logs --previous ...

希望这能为您提供调试容器未来的原因所必需的信息。