Question

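I'm trying to get kubedns working on my cluster, but I can't seem to get it to run. It keeps restarting because the healthz container keeps failing to resolve kubernetes.default.svc.cluster.local.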

Output of the kubedns logs:

I0613 17:55:50.490883       1 dns.go:42] version: v1.6.0-alpha.0.680+3872cb93abf948-dirty
I0613 17:55:50.491791       1 server.go:107] Using https://k8s.internal for kubernetes master, kubernetes API: <nil>
I0613 17:55:50.691495       1 server.go:63] ConfigMap not configured, using values from command line flags
I0613 17:55:50.691539       1 server.go:113] FLAG: --alsologtostderr="false"
I0613 17:55:50.691564       1 server.go:113] FLAG: --config-map=""
I0613 17:55:50.691573       1 server.go:113] FLAG: --config-map-namespace="kube-system"
I0613 17:55:50.691586       1 server.go:113] FLAG: --dns-bind-address="0.0.0.0"
I0613 17:55:50.691591       1 server.go:113] FLAG: --dns-port="10053"
I0613 17:55:50.691600       1 server.go:113] FLAG: --domain="cluster.local."
I0613 17:55:50.691612       1 server.go:113] FLAG: --federations=""
I0613 17:55:50.691618       1 server.go:113] FLAG: --healthz-port="8081"
I0613 17:55:50.691622       1 server.go:113] FLAG: --kube-master-url="https://k8s.internal"
I0613 17:55:50.691626       1 server.go:113] FLAG: --kubecfg-file="/etc/kubernetes/worker-kubeconfig.yaml"
I0613 17:55:50.691630       1 server.go:113] FLAG: --log-backtrace-at=":0"
I0613 17:55:50.691636       1 server.go:113] FLAG: --log-dir=""
I0613 17:55:50.691650       1 server.go:113] FLAG: --log-flush-frequency="5s"
I0613 17:55:50.691659       1 server.go:113] FLAG: --logtostderr="true"
I0613 17:55:50.691667       1 server.go:113] FLAG: --stderrthreshold="2"
I0613 17:55:50.691672       1 server.go:113] FLAG: --v="0"
I0613 17:55:50.691677       1 server.go:113] FLAG: --version="false"
I0613 17:55:50.691691       1 server.go:113] FLAG: --vmodule=""
I0613 17:55:50.691767       1 server.go:155] Starting SkyDNS server (0.0.0.0:10053)
I0613 17:55:50.691866       1 server.go:167] Skydns metrics not enabled
I0613 17:55:50.691967       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0613 17:55:50.691984       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0613 17:55:50.790263       1 server.go:126] Setting up Healthz Handler (/readiness)
I0613 17:55:50.791070       1 server.go:131] Setting up cache handler (/cache)
I0613 17:55:50.791096       1 server.go:120] Status HTTP port 8081
I0613 17:56:49.437868       1 server.go:150] Ignoring signal terminated (can only be terminated by SIGKILL)

Output of healthz:

2017/06/13 17:53:09 Healthz probe on /healthz error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local': Name does not resolve
, at 2017-06-13 17:53:08.890210169 +0000 UTC, error exit status 1

Oddly, if I exec the same lookup in the kubedns container, it succeeds:

$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c kubedns -- nslookup kubernetes.default.svc.cluster.local localhost
Server:    127.0.0.1
Address 1: 127.0.0.1 localhost

Name:      kubernetes.default.svc.cluster.local
Address 1: 172.17.0.1 kubernetes.default.svc.cluster.local

But in the dnsmasq container it does not:

$ kubectl --namespace kube-system exec kube-dns-v15-8nx4h -c dnsmasq -- nslookup kubernetes.default.svc.cluster.local localhost
nslookup: can't resolve 'kubernetes.default.svc.cluster.local': Name does not resolve
Server:    127.0.0.1
Address 1: 127.0.0.1 localhost
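
Both lookups above go through each image's own nslookup and name `localhost` as the server. To compare the two listeners directly, one option is to send the same query to each port by hand. The following is only a minimal sketch: it assumes Python 3 is available in some debugging container that shares the pod's network namespace, and the helper names (`build_query`, `parse_header`, `query`) are made up for illustration. The ports 10053 (kubedns) and 53 (dnsmasq) are taken from the replication controller definition in this question.

```python
import random
import socket
import struct


def build_query(name, qtype=1, txid=None):
    """Build a minimal DNS query packet (A record by default) for `name`."""
    if txid is None:
        txid = random.randint(0, 0xFFFF)
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels terminated by a zero byte.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # QCLASS=IN
    return header + question


def parse_header(packet):
    """Return (txid, rcode, answer_count) from a DNS response header."""
    if len(packet) < 12:
        raise ValueError("DNS response shorter than a 12-byte header")
    txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return txid, flags & 0x000F, ancount


def query(server, port, name, timeout=2.0):
    """Send one UDP query to server:port and return (rcode, answer_count).

    Raises OSError on timeout or socket errors.
    """
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, port))
        data, _ = sock.recvfrom(4096)
    _txid, rcode, ancount = parse_header(data)
    return rcode, ancount
```

Calling `query("127.0.0.1", 10053, "kubernetes.default.svc.cluster.local")` and then the same with port 53 would show whether each listener answers on its own. An rcode of 0 with at least one answer means the name resolved, and an rcode of 3 is NXDOMAIN.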

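I also don't see it performing service discovery to add records, but I'm not sure whether that is simply because the health check kills it before it has a chance.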

Here is the replication controller definition:

{
  "apiVersion": "v1",
  "kind": "ReplicationController",
  "metadata": {
    "labels": {
      "k8s-app": "kube-dns",
      "kubernetes.io/cluster-service": "true",
      "version": "v15"
    },
    "name": "kube-dns-v15",
    "namespace": "kube-system"
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "k8s-app": "kube-dns",
      "version": "v15"
    },
    "template": {
      "metadata": {
        "labels": {
          "k8s-app": "kube-dns",
          "kubernetes.io/cluster-service": "true",
          "version": "v15"
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "ssl-certs",
            "hostPath": {
              "path": "/usr/share/ca-certificates"
            }
          },
          {
            "name": "kubeconfig",
            "hostPath": {
              "path": "/etc/kubernetes/worker-kubeconfig.yaml"
            }
          },
          {
            "name": "etc-kube-ssl",
            "hostPath": {
              "path": "/etc/kubernetes/ssl"
            }
          }
        ],
        "containers": [
          {
            "volumeMounts": [
              {
                "mountPath": "/etc/ssl/certs",
                "name": "ssl-certs",
                "readOnly": true
              },
              {
                "mountPath": "/etc/kubernetes/worker-kubeconfig.yaml",
                "name": "kubeconfig",
                "readOnly": true
              },
              {
                "mountPath": "/etc/kubernetes/ssl",
                "name": "etc-kube-ssl",
                "readOnly": true
              }
            ],
            "args": [
              "--domain=cluster.local.",
              "--dns-port=10053",
              "--kube-master-url=https://k8s.internal",
              "--kubecfg-file=/etc/kubernetes/worker-kubeconfig.yaml"
            ],
            "image": "gcr.io/google_containers/kubedns-amd64:1.9",
            "livenessProbe": {
              "failureThreshold": 5,
              "httpGet": {
                "path": "/healthz",
                "port": 8080,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 60,
              "successThreshold": 1,
              "timeoutSeconds": 5
            },
            "name": "kubedns",
            "ports": [
              {
                "containerPort": 10053,
                "name": "dns-local",
                "protocol": "UDP"
              },
              {
                "containerPort": 10053,
                "name": "dns-tcp-local",
                "protocol": "TCP"
              }
            ],
            "readinessProbe": {
              "httpGet": {
                "path": "/readiness",
                "port": 8081,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 30,
              "timeoutSeconds": 5
            },
            "resources": {
              "limits": {
                "cpu": "100m",
                "memory": "200Mi"
              },
              "requests": {
                "cpu": "100m",
                "memory": "50Mi"
              }
            }
          },
          {
            "args": [
              "--cache-size=1000",
              "--no-resolv",
              "--server=127.0.0.1#10053"
            ],
            "image": "gcr.io/google_containers/kube-dnsmasq-amd64:1.4.1",
            "name": "dnsmasq",
            "ports": [
              {
                "containerPort": 53,
                "name": "dns",
                "protocol": "UDP"
              },
              {
                "containerPort": 53,
                "name": "dns-tcp",
                "protocol": "TCP"
              }
            ]
          },
          {
            "args": [
              "-cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null",
              "-port=8080",
              "-quiet"
            ],
            "image": "gcr.io/google_containers/exechealthz-amd64:v1.2.0",
            "name": "healthz",
            "ports": [
              {
                "containerPort": 8080,
                "protocol": "TCP"
              }
            ],
            "resources": {
              "limits": {
                "cpu": "10m",
                "memory": "20Mi"
              },
              "requests": {
                "cpu": "10m",
                "memory": "20Mi"
              }
            }
          }
        ],
        "dnsPolicy": "Default"
      }
    }
  }
}
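
Since the restarts are driven by probes, it can help to list which port each probe targets and which container declares that port. The sketch below does this for a manifest like the one above. It assumes the definition has been saved to a local JSON file, and the function names are hypothetical. Note that `containerPort` declarations are informational in Kubernetes, so a probe port that no container declares is not an error in itself.

```python
import json


def port_owners(pod_spec):
    """Map each declared containerPort to the names of containers declaring it."""
    owners = {}
    for container in pod_spec.get("containers", []):
        for port in container.get("ports", []):
            owners.setdefault(port["containerPort"], set()).add(container["name"])
    return owners


def probe_targets(pod_spec):
    """Yield (container, probe kind, port, declaring containers) per httpGet probe."""
    owners = port_owners(pod_spec)
    for container in pod_spec.get("containers", []):
        for kind in ("livenessProbe", "readinessProbe"):
            http_get = container.get(kind, {}).get("httpGet")
            if http_get:
                port = http_get["port"]
                yield container["name"], kind, port, sorted(owners.get(port, []))


def load_pod_spec(path):
    """Read a ReplicationController JSON file and return its pod template spec."""
    with open(path) as handle:
        controller = json.load(handle)
    return controller["spec"]["template"]["spec"]
```

Run against the definition above, this reports that the kubedns liveness probe targets port 8080, which is declared by the healthz container, and that the readiness probe targets port 8081, which no container declares. Because containers in a pod share one network namespace, a failing healthz check on 8080 leads the kubelet to restart the kubedns container that carries the probe.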

The kubedns container restarts endlessly

0 answers: