How do I get more debugging information from my Kubernetes Docker containers?

Date: 2016-07-15 13:57:07

Tags: kubernetes coreos

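I am having a really hard time getting Kubernetes to work on CoreOS. My biggest problem is finding the right information to debug the following issue.

(I followed this tutorial: https://coreos.com/kubernetes/docs/latest/getting-started.html)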

```
core@helena-coreos ~ $ docker ps 
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
423aa16a66e7        gcr.io/google_containers/pause:2.0         "/pause"                 2 hours ago         Up 2 hours                              k8s_POD.6059dfa2_kube-controller-manager-37.139.31.151_kube-system_f11896dcf9adf655df092f2a12a41673_ec25db0a
4b456d7cf17d        quay.io/coreos/hyperkube:v1.2.4_coreos.1   "/hyperkube apiserver"   3 hours ago         Up 3 hours                              k8s_kube-apiserver.33667886_kube-apiserver-37.139.31.151_kube-system_bfdfe85e7787a05e49ebfe95e7d4a401_abd7982f
52e25d838af3        gcr.io/google_containers/pause:2.0         "/pause"                 3 hours ago         Up 3 hours                              k8s_POD.6059dfa2_kube-apiserver-37.139.31.151_kube-system_bfdfe85e7787a05e49ebfe95e7d4a401_411b1a93
```

Clearly there is a problem with kube-controller-manager. The two main sources of debugging information (as far as I know) are the journal and the Docker logs.

The Docker logs show nothing at all.

```
core@helena-coreos ~ $ docker logs 423aa16a66e7
core@helena-coreos ~ $  
```

I also tried to log in to this container with docker exec, but that did not work either.
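One likely reason for the empty output: judging by its name (k8s_POD...) and its image (gcr.io/google_containers/pause:2.0), 423aa16a66e7 is the pod's pause container, which only holds the pod's namespaces and writes no logs. The kube-controller-manager container itself has already exited, and plain docker ps lists running containers only, so it would appear only in docker ps -a. The following is a minimal sketch, assuming a Docker version whose docker ps supports --format; app_containers is a hypothetical helper name, not part of Docker or Kubernetes.

```sh
# Hypothetical helper (not part of Docker or Kubernetes).
# Reads "ID NAME" pairs on stdin, one per line, and prints the IDs of
# kubelet-managed containers (name prefix "k8s_") that are not pause
# containers (name prefix "k8s_POD").
app_containers() {
  awk '$2 ~ /^k8s_/ && $2 !~ /^k8s_POD/ { print $1 }'
}

# Possible usage (includes exited containers because of -a):
#   docker ps -a --format '{{.ID}} {{.Names}}' | app_containers
# then run "docker logs <id>" on each ID that is printed.
```

Exited containers usually keep their logs until they are removed, so docker logs on the ID of the exited kube-controller-manager container is more likely to show the actual error than the pause container is.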

So my hopes rested on the journal.

```
Jul 15 13:16:59 helena-coreos kubelet-wrapper[2318]: I0715 13:16:59.143892    2318 manager.go:2050] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-37.139.31.151_kube-system(f11896dcf9adf655df092f2a12a41673)
Jul 15 13:16:59 helena-coreos kubelet-wrapper[2318]: E0715 13:16:59.143992    2318 pod_workers.go:138] Error syncing pod f11896dcf9adf655df092f2a12a41673, skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-37.139.31.151_kube-system(f11896dcf9adf655df092f2a12a41673)"
```

So kube-controller-manager fails to start, but I have no idea why. How can I resolve this?

I have included the relevant configuration below:

```
core@helena-coreos ~ $ cat /etc/flannel/options.env
FLANNELD_IFACE=37.139.31.151
FLANNELD_ETCD_ENDPOINTS=https://37.139.31.151:2379
FLANNELD_ETCD_CAFILE=/etc/ssl/etcd/ca.pem
FLANNELD_ETCD_CERTFILE=/etc/ssl/etcd/coreos.pem
FLANNELD_ETCD_KEYFILE=/etc/ssl/etcd/coreos-key.pem
```

```
core@helena-coreos ~ $ cat /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
```

```
core@helena-coreos ~ $ cat /etc/systemd/system/docker.service.d/35-flannel.conf 
[Unit]
Requires=flanneld.service
After=flanneld.service
```

```

core@helena-coreos ~ $ cat /etc/systemd/system/kubelet.service
[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests

Environment=KUBELET_VERSION=v1.2.4_coreos.cni.1
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--api-servers=http://127.0.0.1:8080 \
--network-plugin-dir=/etc/kubernetes/cni/net.d \
--network-plugin= \
--register-schedulable=false \
--allow-privileged=true \
--config=/etc/kubernetes/manifests \
--hostname-override=37.139.31.151 \
--cluster-dns=10.3.0.10 \
--cluster-domain=cluster.local
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target

```

```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    -
      command:
        - /hyperkube
        - apiserver
        - "--bind-address=0.0.0.0"
        - "--etcd-servers=https://37.139.31.151:2379"
        - "--etcd-cafile=/etc/kubernetes/ssl/ca.pem"
        - "--etcd-certfile=/etc/kubernetes/ssl/worker.pem"
        - "--etcd-keyfile=/etc/kubernetes/ssl/worker-key.pem"
        - "--allow-privileged=true"
        - "--service-cluster-ip-range=10.3.0.0/24"
        - "--secure-port=443"
        - "--advertise-address=37.139.31.151"
        - "--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota"
        - "--tls-cert-file=/etc/kubernetes/ssl/apiserver.pem"
        - "--tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
        - "--client-ca-file=/etc/kubernetes/ssl/ca.pem"
        - "--service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
        - "--runtime-config=extensions/v1beta1=true,extensions/v1beta1/thirdpartyresources=true"
      image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
      name: kube-apiserver
      ports:
        -
          containerPort: 443
          hostPort: 443
          name: https
        -
          containerPort: 8080
          hostPort: 8080
          name: local
      volumeMounts:
        -
          mountPath: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
          readOnly: true
        -
          mountPath: /etc/ssl/certs
          name: ssl-certs-host
          readOnly: true
  hostNetwork: true
  volumes:
    -
      hostPath:
        path: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
    -
      hostPath:
        path: /usr/share/ca-certificates
      name: ssl-certs-host
```

```

core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-proxy.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
    command:
  - /hyperkube
  - proxy
  - "--master=http://127.0.0.1:8080"
  - "--proxy-mode=iptables"
  securityContext:
    privileged: true
  volumeMounts:
  - mountPath: /etc/ssl/certs
    name: ssl-certs-host
    readOnly: true
volumes:
- hostPath:
  path: /usr/share/ca-certificates
name: ssl-certs-host

```

```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-controller-manager
    image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
    command:
    - /hyperkube
    - controller-manager
    - "--master=http://127.0.0.1:8080"
    - "--leader-elect=true"
    - "--service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
    - "--root-ca-file=/etc/kubernetes/ssl/ca.pem"
    livenessProbe:
      httpGet:
        host: "127.0.0.1"
        path: /healthz
        port: 10252
      initialDelaySeconds: 15
      timeoutSeconds: 1
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
    path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
    path: /usr/share/ca-certificates
    name: ssl-certs-host
```

```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
    command:
  - /hyperkube
  - scheduler
  - "--master=http://127.0.0.1:8080"
  - "--leader-elect=true"
livenessProbe:
  httpGet:
    host: "127.0.0.1"
    path: /healthz
    port: 10251
  initialDelaySeconds: 15
  timeoutSeconds: 1
```

I have not set up Calico.

I checked the network configuration in etcd:

```
core@helena-coreos ~ $ etcdctl get /coreos.com/network/config
{"Network":"10.2.0.0/16", "Backend":{"Type":"vxlan"}}
```

I also checked that the API is working:

```
core@helena-coreos ~ $ curl http://127.0.0.1:8080/version
{
  "major": "1",
  "minor": "2",
  "gitVersion": "v1.2.4+coreos.1",
  "gitCommit": "7f80f816ee1a23c26647aee8aecd32f0b21df754",
  "gitTreeState": "clean"
}
```

OK, thanks to @Sasha Kurakin, I am one step further :)

```
core@helena-coreos ~ $ ./kubectl describe pods kube-controller-manager-37.139.31.151 --namespace="kube-system"
Name:       kube-controller-manager-37.139.31.151
Namespace:  kube-system
Node:       37.139.31.151/37.139.31.151
Start Time: Fri, 15 Jul 2016 09:52:19 +0000
Labels:     <none>
Status:     Running
IP:     37.139.31.151
Controllers:    <none>
Containers:
  kube-controller-manager:
     Container ID:  docker://6fee488ee838f60157b071113e43182c97b4217018933453732290a4f131767d
   Image:       quay.io/coreos/hyperkube:v1.2.4_coreos.1
   Image ID:        docker://sha256:2cac344d3116165bd808b965faae6cd9d46e840b9d70b40d8e679235aa9a6507
   Port:        
   Command:
     /hyperkube
     controller-manager
     --master=http://127.0.0.1:8080
     --leader-elect=true
     --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
     --root-ca-file=/etc/kubernetes/ssl/ca.pem
   QoS Tier:
     memory:        BestEffort
     cpu:       BestEffort
   State:       Waiting
     Reason:        CrashLoopBackOff
   Last State:      Terminated
     Reason:        Error
     Exit Code: 255
     Started:       Fri, 15 Jul 2016 14:30:40 +0000
     Finished:      Fri, 15 Jul 2016 14:30:58 +0000
   Ready:       False
   Restart Count:   12
   Liveness:        http-get http://127.0.0.1:10252/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
   Environment Variables:
 Conditions:
  Type      Status
  Ready     False 
Volumes:
  ssl-certs-kubernetes:
     Type:  EmptyDir (a temporary directory that shares a pod's lifetime)
     Medium:    
  ssl-certs-host:
     Type:  EmptyDir (a temporary directory that shares a pod's lifetime)
     Medium:    
No events.
```
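One detail in this output may be worth checking: both volumes are reported as EmptyDir, although the manifest declares them with hostPath. In the kube-controller-manager.yaml pasted above, path: sits at the same indentation as hostPath: rather than beneath it, so a YAML parser reads hostPath as empty and path as a separate key. If the file on disk really looks like the paste, the certificate files under /etc/kubernetes/ssl would not be visible inside the container. This is an inference from the paste, not a confirmed diagnosis. A rough sketch of a check follows; check_hostpath_indent is a hypothetical helper, not an existing tool.

```sh
# Hypothetical helper: reads a manifest (file arguments or stdin) and reports
# every "hostPath:" key whose next non-blank line is not indented deeper than
# the key itself, which means hostPath has no nested "path:".
check_hostpath_indent() {
  awk '
    BEGIN { bad = 0 }
    pending && /[^ ]/ {
      # Column of the first non-space character on the line after hostPath.
      match($0, /[^ ]/)
      if (RSTART <= keycol) {
        printf "line %d: hostPath has no nested keys\n", keyline
        bad = 1
      }
      pending = 0
    }
    /hostPath:[ ]*$/ {
      keycol = index($0, "hostPath:")   # column where the key starts
      keyline = NR
      pending = 1
    }
    END { exit bad }
  ' "$@"
}

# Possible usage:
#   check_hostpath_indent /etc/kubernetes/manifests/kube-controller-manager.yaml
```

The check is purely textual and only compares indentation columns, so it is a quick sanity test rather than a YAML validator. For comparison, in the kube-apiserver manifest above path: is indented one level below hostPath:.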

1 Answer:

Answer 0 (score: 2)

Try running kubectl describe pods pod_name --namespace=pods_namespace to get more information.
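The describe output is long, so it can help to reduce it to the lines about container state. The following is a minimal sketch; describe_summary is a hypothetical helper name, and the field names it matches are the ones that appear in the describe output shown in the question.

```sh
# Hypothetical helper: reads "kubectl describe pods ..." output on stdin and
# prints only the lines about container state, exit code and restarts.
describe_summary() {
  grep -E '^[[:space:]]*(State|Last State|Reason|Exit Code|Restart Count):'
}

# Possible usage:
#   kubectl describe pods pod_name --namespace=pods_namespace | describe_summary
```

For a container that keeps crashing, kubectl logs with the --previous flag asks for the output of the last terminated instance of the container, which is often where the actual error message is.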
