I am trying to get kube-dns working on my cluster, but I can't seem to keep it running. It keeps restarting because the healthz container keeps failing to resolve kubernetes.default.svc.cluster.local.

Output of the kubedns logs:
I0613 17:55:50.490883 1 dns.go:42] version: v1.6.0-alpha.0.680+3872cb93abf948-dirty
I0613 17:55:50.491791 1 server.go:107] Using https://k8s.internal for kubernetes master, kubernetes API: <nil>
I0613 17:55:50.691495 1 server.go:63] ConfigMap not configured, using values from command line flags
I0613 17:55:50.691539 1 server.go:113] FLAG: --alsologtostderr="false"
I0613 17:55:50.691564 1 server.go:113] FLAG: --config-map=""
I0613 17:55:50.691573 1 server.go:113] FLAG: --config-map-namespace="kube-system"
I0613 17:55:50.691586 1 server.go:113] FLAG: --dns-bind-address="0.0.0.0"
I0613 17:55:50.691591 1 server.go:113] FLAG: --dns-port="10053"
I0613 17:55:50.691600 1 server.go:113] FLAG: --domain="cluster.local."
I0613 17:55:50.691612 1 server.go:113] FLAG: --federations=""
I0613 17:55:50.691618 1 server.go:113] FLAG: --healthz-port="8081"
I0613 17:55:50.691622 1 server.go:113] FLAG: --kube-master-url="https://k8s.internal"
I0613 17:55:50.691626 1 server.go:113] FLAG: --kubecfg-file="/etc/kubernetes/worker-kubeconfig.yaml"
I0613 17:55:50.691630 1 server.go:113] FLAG: --log-backtrace-at=":0"
I0613 17:55:50.691636 1 server.go:113] FLAG: --log-dir=""
I0613 17:55:50.691650 1 server.go:113] FLAG: --log-flush-frequency="5s"
I0613 17:55:50.691659 1 server.go:113] FLAG: --logtostderr="true"
I0613 17:55:50.691667 1 server.go:113] FLAG: --stderrthreshold="2"
I0613 17:55:50.691672 1 server.go:113] FLAG: --v="0"
I0613 17:55:50.691677 1 server.go:113] FLAG: --version="false"
I0613 17:55:50.691691 1 server.go:113] FLAG: --vmodule=""
I0613 17:55:50.691767 1 server.go:155] Starting SkyDNS server (0.0.0.0:10053)
I0613 17:55:50.691866 1 server.go:167] Skydns metrics not enabled
I0613 17:55:50.691967 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0613 17:55:50.691984 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0613 17:55:50.790263 1 server.go:126] Setting up Healthz Handler (/readiness)
I0613 17:55:50.791070 1 server.go:131] Setting up cache handler (/cache)
I0613 17:55:50.791096 1 server.go:120] Status HTTP port 8081
I0613 17:56:49.437868 1 server.go:150] Ignoring signal terminated (can only be terminated by SIGKILL)
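As far as I can tell, that last "Ignoring signal terminated" line is the kubelet killing the container after the liveness probe fails; the restarts and probe failures should show up in the pod's events:

$ kubectl --namespace kube-system describe pod kube-dns-v15-8nx4h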
Output from healthz:
2017/06/13 17:53:09 Healthz probe on /healthz error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local': Name does not resolve, at 2017-06-13 17:53:08.890210169 +0000 UTC, error exit status 1
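That is the exact command the exechealthz sidecar runs (see the -cmd flag in the replication controller below), so the failure should be reproducible by hand from the healthz container:

$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c healthz -- nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053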
The strange thing is, if I exec into the kubedns container and do the same lookup, it succeeds:
$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c kubedns -- nslookup kubernetes.default.svc.cluster.local localhost
Server: 127.0.0.1
Address 1: 127.0.0.1 localhost
Name: kubernetes.default.svc.cluster.local
Address 1: 172.17.0.1 kubernetes.default.svc.cluster.local
but on the dnsmasq container it doesn't:
$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c dnsmasq -- nslookup kubernetes.default.svc.cluster.local localhost
nslookup: can't resolve 'kubernetes.default.svc.cluster.local': Name does not resolve
Server: 127.0.0.1
Address 1: 127.0.0.1 localhost
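Since all three containers share the pod's network namespace, I can also query the kubedns port directly from the dnsmasq container (the same 127.0.0.1#10053 upstream that dnsmasq's --server flag points at) to check whether the upstream is reachable at all:

$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c dnsmasq -- nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053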
I also don't see it doing any service discovery to add records, but I'm not sure whether that's just because the health check kills it before it has a chance.
Here is the replication controller definition:
{
  "apiVersion": "v1",
  "kind": "ReplicationController",
  "metadata": {
    "labels": {
      "k8s-app": "kube-dns",
      "kubernetes.io/cluster-service": "true",
      "version": "v15"
    },
    "name": "kube-dns-v15",
    "namespace": "kube-system"
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "k8s-app": "kube-dns",
      "version": "v15"
    },
    "template": {
      "metadata": {
        "labels": {
          "k8s-app": "kube-dns",
          "kubernetes.io/cluster-service": "true",
          "version": "v15"
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "ssl-certs",
            "hostPath": {
              "path": "/usr/share/ca-certificates"
            }
          },
          {
            "name": "kubeconfig",
            "hostPath": {
              "path": "/etc/kubernetes/worker-kubeconfig.yaml"
            }
          },
          {
            "name": "etc-kube-ssl",
            "hostPath": {
              "path": "/etc/kubernetes/ssl"
            }
          }
        ],
        "containers": [
          {
            "volumeMounts": [
              {
                "mountPath": "/etc/ssl/certs",
                "name": "ssl-certs",
                "readOnly": true
              },
              {
                "mountPath": "/etc/kubernetes/worker-kubeconfig.yaml",
                "name": "kubeconfig",
                "readOnly": true
              },
              {
                "mountPath": "/etc/kubernetes/ssl",
                "name": "etc-kube-ssl",
                "readOnly": true
              }
            ],
            "args": [
              "--domain=cluster.local.",
              "--dns-port=10053",
              "--kube-master-url=https://k8s.internal",
              "--kubecfg-file=/etc/kubernetes/worker-kubeconfig.yaml"
            ],
            "image": "gcr.io/google_containers/kubedns-amd64:1.9",
            "livenessProbe": {
              "failureThreshold": 5,
              "httpGet": {
                "path": "/healthz",
                "port": 8080,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 60,
              "successThreshold": 1,
              "timeoutSeconds": 5
            },
            "name": "kubedns",
            "ports": [
              {
                "containerPort": 10053,
                "name": "dns-local",
                "protocol": "UDP"
              },
              {
                "containerPort": 10053,
                "name": "dns-tcp-local",
                "protocol": "TCP"
              }
            ],
            "readinessProbe": {
              "httpGet": {
                "path": "/readiness",
                "port": 8081,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 30,
              "timeoutSeconds": 5
            },
            "resources": {
              "limits": {
                "cpu": "100m",
                "memory": "200Mi"
              },
              "requests": {
                "cpu": "100m",
                "memory": "50Mi"
              }
            }
          },
          {
            "args": [
              "--cache-size=1000",
              "--no-resolv",
              "--server=127.0.0.1#10053"
            ],
            "image": "gcr.io/google_containers/kube-dnsmasq-amd64:1.4.1",
            "name": "dnsmasq",
            "ports": [
              {
                "containerPort": 53,
                "name": "dns",
                "protocol": "UDP"
              },
              {
                "containerPort": 53,
                "name": "dns-tcp",
                "protocol": "TCP"
              }
            ]
          },
          {
            "args": [
              "-cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null",
              "-port=8080",
              "-quiet"
            ],
            "image": "gcr.io/google_containers/exechealthz-amd64:v1.2.0",
            "name": "healthz",
            "ports": [
              {
                "containerPort": 8080,
                "protocol": "TCP"
              }
            ],
            "resources": {
              "limits": {
                "cpu": "10m",
                "memory": "20Mi"
              },
              "requests": {
                "cpu": "10m",
                "memory": "20Mi"
              }
            }
          }
        ],
        "dnsPolicy": "Default"
      }
    }
  }
}
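For completeness, the exechealthz endpoint can also be hit by hand from inside the pod (assuming wget is present in the busybox-based image), which shows the raw probe output without waiting for the kubelet:

$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c healthz -- wget -qO- http://127.0.0.1:8080/healthz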