I tried upgrading from 1.7 to 1.9 using kubeadm, and kube-dns is crashlooping. I deleted the deployment and applied a new one using the latest kube-dns YAML (replacing the clusterIP with 10.96.0.10 and the domain with cluster.local).
The kubedns container fails after it cannot get a valid response from the API server. The 10.96.0.1 IP does respond to wget on port 443 from all servers in the cluster (with a 403 Forbidden response).
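For reference, this is roughly the check run from each node (a sketch; the --no-check-certificate flag is an assumption, since the cluster CA is self-signed):

$ wget --no-check-certificate -O- https://10.96.0.1:443/
# expected result: "ERROR 403: Forbidden" -- the TCP connect and TLS
# handshake succeed, the apiserver just rejects the anonymous request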
E0104 21:51:42.732805 1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:147: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0104 21:51:42.732971 1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:150: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
Is this a connectivity problem causing the errors in the log, a configuration issue, or a security-model change?
Thanks.
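In case it helps reproduce: the same connect from the pod network times out, matching the kubedns logs below. A throwaway pod can show this (a sketch; busybox's wget speaks plain HTTP only, so the response itself is unusable, but the failure mode is what matters):

$ kubectl run -it --rm dns-debug --image=busybox --restart=Never -- \
    wget -T 5 -O- http://10.96.0.1:443/
# "download timed out" here means the TCP connect from the pod network
# never completed, pointing at the overlay/kube-proxy path rather than
# at kube-dns itself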
$ kubectl get nodes
NAME       STATUS    ROLES     AGE       VERSION
ubuntu80   Ready     master    165d      v1.9.1
ubuntu81   Ready     <none>    165d      v1.9.1
ubuntu82   Ready     <none>    165d      v1.9.1
ubuntu83   Ready     <none>    163d      v1.9.1
$ kubectl get all --namespace=kube-system
NAME                            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
ds/kube-flannel-ds              4         4         4         0            4           beta.kubernetes.io/arch=amd64   165d
ds/kube-proxy                   4         4         4         4            4           <none>                          165d
ds/traefik-ingress-controller   3         3         3         3            3           <none>                          165d

NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns        1         1         1            0           1h
deploy/tiller-deploy   1         1         1            1           163d

NAME                          DESIRED   CURRENT   READY     AGE
rs/kube-dns-6c857864fb        1         1         0         1h
rs/tiller-deploy-3341511835   1         1         1         105d

NAME                                  READY     STATUS             RESTARTS   AGE
po/etcd-ubuntu80                      1/1       Running            1          16d
po/kube-apiserver-ubuntu80            1/1       Running            1          2h
po/kube-controller-manager-ubuntu80   1/1       Running            1          2h
po/kube-dns-6c857864fb-grhxp          1/3       CrashLoopBackOff   52         1h
po/kube-flannel-ds-07npj              2/2       Running            32         165d
po/kube-flannel-ds-169lh              2/2       Running            26         165d
po/kube-flannel-ds-50c56              2/2       Running            27         163d
po/kube-flannel-ds-wkd7j              2/2       Running            29         165d
po/kube-proxy-495n7                   1/1       Running            1          2h
po/kube-proxy-9g7d2                   1/1       Running            1          2h
po/kube-proxy-d856z                   1/1       Running            0          2h
po/kube-proxy-kzmcc                   1/1       Running            0          2h
po/kube-scheduler-ubuntu80            1/1       Running            1          2h
po/tiller-deploy-3341511835-m3x26     1/1       Running            2          58d
po/traefik-ingress-controller-51r7d   1/1       Running            4          105d
po/traefik-ingress-controller-sf6lc   1/1       Running            4          105d
po/traefik-ingress-controller-xz1rt   1/1       Running            5          105d

NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
svc/kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP   1h
svc/kubernetes-dashboard   ClusterIP   10.101.112.198   <none>        443/TCP         165d
svc/tiller-deploy          ClusterIP   10.98.117.242    <none>        44134/TCP       163d
svc/traefik-web-ui         ClusterIP   10.110.215.194   <none>        80/TCP          165d
$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns
I0104 21:51:12.730927 1 dns.go:48] version: 1.14.6-3-gc36cb11
I0104 21:51:12.731643 1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0104 21:51:12.731673 1 server.go:112] FLAG: --alsologtostderr="false"
I0104 21:51:12.731679 1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0104 21:51:12.731683 1 server.go:112] FLAG: --config-map=""
I0104 21:51:12.731686 1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0104 21:51:12.731688 1 server.go:112] FLAG: --config-period="10s"
I0104 21:51:12.731693 1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0104 21:51:12.731695 1 server.go:112] FLAG: --dns-port="10053"
I0104 21:51:12.731713 1 server.go:112] FLAG: --domain="cluster.local."
I0104 21:51:12.731717 1 server.go:112] FLAG: --federations=""
I0104 21:51:12.731723 1 server.go:112] FLAG: --healthz-port="8081"
I0104 21:51:12.731726 1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0104 21:51:12.731729 1 server.go:112] FLAG: --kube-master-url=""
I0104 21:51:12.731733 1 server.go:112] FLAG: --kubecfg-file=""
I0104 21:51:12.731735 1 server.go:112] FLAG: --log-backtrace-at=":0"
I0104 21:51:12.731740 1 server.go:112] FLAG: --log-dir=""
I0104 21:51:12.731743 1 server.go:112] FLAG: --log-flush-frequency="5s"
I0104 21:51:12.731746 1 server.go:112] FLAG: --logtostderr="true"
I0104 21:51:12.731748 1 server.go:112] FLAG: --nameservers=""
I0104 21:51:12.731751 1 server.go:112] FLAG: --stderrthreshold="2"
I0104 21:51:12.731753 1 server.go:112] FLAG: --v="2"
I0104 21:51:12.731756 1 server.go:112] FLAG: --version="false"
I0104 21:51:12.731761 1 server.go:112] FLAG: --vmodule=""
I0104 21:51:12.731798 1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0104 21:51:12.731979 1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0104 21:51:12.731987 1 dns.go:146] Starting endpointsController
I0104 21:51:12.731991 1 dns.go:149] Starting serviceController
I0104 21:51:12.732457 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0104 21:51:12.732467 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0104 21:51:13.232355 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:13.732395 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:14.232389 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:14.732389 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:15.232369 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:42.732629 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
E0104 21:51:42.732805 1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:147: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0104 21:51:42.732971 1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:150: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0104 21:51:43.232257 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:51.232379 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:51.732371 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:51:52.232390 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:52:11.732376 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0104 21:52:12.232382 1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
F0104 21:52:12.732377 1 dns.go:167] Timeout waiting for initialization
$ kubectl describe po/kube-dns-6c857864fb-grhxp --namespace=kube-system
Name:           kube-dns-6c857864fb-grhxp
Namespace:      kube-system
Node:           ubuntu82/10.80.82.1
Start Time:     Fri, 05 Jan 2018 01:55:48 +0530
Labels:         k8s-app=kube-dns
                pod-template-hash=2741342096
Annotations:    scheduler.alpha.kubernetes.io/critical-pod=
Status:         Running
IP:             10.244.2.12
Controlled By:  ReplicaSet/kube-dns-6c857864fb
Containers:
  kubedns:
    Container ID:  docker://3daa4233f54fa251abdcdfe73d2e71179356f5da45983d19fe66a3f18bab8d13
    Image:         gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7
    Image ID:      docker-pullable://gcr.io/google_containers/k8s-dns-kube-dns-amd64@sha256:f5bddc71efe905f4e4b96f3ca346414be6d733610c1525b98fff808f93966680
    Ports:         10053/UDP, 10053/TCP, 10055/TCP
    Args:
      --domain=cluster.local.
      --dns-port=10053
      --config-dir=/kube-dns-config
      --v=2
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Fri, 05 Jan 2018 03:21:12 +0530
      Finished:     Fri, 05 Jan 2018 03:22:12 +0530
    Ready:          False
    Restart Count:  26
    Limits:
      memory:  170Mi
    Requests:
      cpu:     100m
      memory:  70Mi
    Liveness:   http-get http://:10054/healthcheck/kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:  http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PROMETHEUS_PORT:  10055
    Mounts:
      /kube-dns-config from kube-dns-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-cpzzw (ro)
  dnsmasq:
    Container ID:  docker://a40a34e6fdf7176ea148fdb1f21d157c5d264e44bd14183ed9d19164a742fb65
    Image:         gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
    Image ID:      docker-pullable://gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64@sha256:6cfb9f9c2756979013dbd3074e852c2d8ac99652570c5d17d152e0c0eb3321d6
    Ports:         53/UDP, 53/TCP
    Args:
      -v=2
      -logtostderr
      -configDir=/etc/k8s/dns/dnsmasq-nanny
      -restartDnsmasq=true
      --
      -k
      --cache-size=1000
      --no-negcache
      --log-facility=-
      --server=/cluster.local/127.0.0.1#10053
      --server=/in-addr.arpa/127.0.0.1#10053
      --server=/ip6.arpa/127.0.0.1#10053
    State:          Running
      Started:      Fri, 05 Jan 2018 03:24:44 +0530
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 05 Jan 2018 03:17:33 +0530
      Finished:     Fri, 05 Jan 2018 03:19:33 +0530
    Ready:          True
    Restart Count:  27
    Requests:
      cpu:     150m
      memory:  20Mi
    Liveness:  http-get http://:10054/healthcheck/dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /etc/k8s/dns/dnsmasq-nanny from kube-dns-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-cpzzw (ro)
  sidecar:
    Container ID:  docker://c05b33a08344f15b0d1a1e8fee39cc05b6d9de6a24db6d2cd05e92c2706fc03c
    Image:         gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7
    Image ID:      docker-pullable://gcr.io/google_containers/k8s-dns-sidecar-amd64@sha256:f80f5f9328107dc516d67f7b70054354b9367d31d4946a3bffd3383d83d7efe8
    Port:          10054/TCP
    Args:
      --v=2
      --logtostderr
      --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
      --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
    State:          Running
      Started:      Fri, 05 Jan 2018 02:09:25 +0530
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 05 Jan 2018 01:55:50 +0530
      Finished:     Fri, 05 Jan 2018 02:08:20 +0530
    Ready:          True
    Restart Count:  1
    Requests:
      cpu:     10m
      memory:  20Mi
    Liveness:  http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-cpzzw (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  kube-dns-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-dns
    Optional:  true
  kube-dns-token-cpzzw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-dns-token-cpzzw
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                 From               Message
  ----     ------                 ----                ----               -------
  Warning  Unhealthy              46m (x57 over 1h)   kubelet, ubuntu82  Readiness probe failed: Get http://10.244.2.12:8081/readiness: dial tcp 10.244.2.12:8081: getsockopt: connection refused
  Warning  Unhealthy              36m (x42 over 1h)   kubelet, ubuntu82  Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  BackOff                31m (x162 over 1h)  kubelet, ubuntu82  Back-off restarting failed container
  Normal   Killing                26m (x13 over 1h)   kubelet, ubuntu82  Killing container with id docker://dnsmasq:Container failed liveness probe.. Container will be killed and recreated.
  Normal   SuccessfulMountVolume  21m                 kubelet, ubuntu82  MountVolume.SetUp succeeded for volume "kube-dns-token-cpzzw"
  Normal   SuccessfulMountVolume  21m                 kubelet, ubuntu82  MountVolume.SetUp succeeded for volume "kube-dns-config"
  Normal   Pulled                 21m                 kubelet, ubuntu82  Container image "gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7" already present on machine
  Normal   Started                21m                 kubelet, ubuntu82  Started container
  Normal   Created                21m                 kubelet, ubuntu82  Created container
  Normal   Started                19m (x2 over 21m)   kubelet, ubuntu82  Started container
  Normal   Created                19m (x2 over 21m)   kubelet, ubuntu82  Created container
  Normal   Pulled                 19m (x2 over 21m)   kubelet, ubuntu82  Container image "gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7" already present on machine
  Warning  Unhealthy              19m (x4 over 20m)   kubelet, ubuntu82  Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy              16m (x22 over 21m)  kubelet, ubuntu82  Readiness probe failed: Get http://10.244.2.12:8081/readiness: dial tcp 10.244.2.12:8081: getsockopt: connection refused
  Normal   Killing                6m (x6 over 19m)    kubelet, ubuntu82  Killing container with id docker://dnsmasq:Container failed liveness probe.. Container will be killed and recreated.
  Warning  BackOff                1m (x65 over 20m)   kubelet, ubuntu82  Back-off restarting failed container
Answer 0 (score: 2)
kube-dns 1.14.7 is not compatible with Kubernetes 1.9.1. In my case, kubedns was trying to connect to the apiserver on port 443 instead of port 6443 as configured.
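For context, 10.96.0.1:443 is the ClusterIP of the default kubernetes service, which kube-proxy translates to the apiserver's real listen address (6443 on a kubeadm cluster). The mapping can be inspected like this (output will vary per cluster):

$ kubectl get svc kubernetes --namespace=default
# CLUSTER-IP should be 10.96.0.1, PORT(S) 443/TCP
$ kubectl get endpoints kubernetes --namespace=default
# ENDPOINTS should list the master's address with port 6443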
When I changed the image version to 1.14.8 (the latest on the kube-dns GitHub), kubedns picked up the apiserver port correctly and the problem went away:
kubectl edit deploy kube-dns --namespace=kube-system
# change the image version to 1.14.8 and it works
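An equivalent non-interactive bump, using the container names and image repositories from the pod description in the question (a sketch; it assumes all three containers should move to the same 1.14.8 tag):

kubectl set image deployment/kube-dns --namespace=kube-system \
  kubedns=gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.8 \
  dnsmasq=gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.8 \
  sidecar=gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.8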
Answer 1 (score: 0)
Yes, I saw this problem with kube-dns 1.14.7 as well. It was fixed by using the latest kube-dns release, 1.14.8, from https://github.com/kubernetes/dns/releases, by doing the following:
kubectl edit deploy kube-dns --namespace=kube-system
# change the image version in the "Image:" field to 1.14.8
If you can still see the problem after that, you can also:
kubectl create configmap --namespace=kube-system kube-dns
kubectl delete pod <name of kube-dns pod> --namespace=kube-system
# kube-dns should restart and work now
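Either way, the log command from the question can confirm the fix: once the new image is running, the reflector i/o timeouts should stop and the pod should report 3/3 Running:

kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
kubectl logs --namespace=kube-system \
  $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns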