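I'm having a really hard time getting Kubernetes to work on CoreOS. My biggest problem is finding the right information to debug the following issue.
(I followed this tutorial: https://coreos.com/kubernetes/docs/latest/getting-started.html)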
```
core@helena-coreos ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
423aa16a66e7 gcr.io/google_containers/pause:2.0 "/pause" 2 hours ago Up 2 hours k8s_POD.6059dfa2_kube-controller-manager-37.139.31.151_kube-system_f11896dcf9adf655df092f2a12a41673_ec25db0a
4b456d7cf17d quay.io/coreos/hyperkube:v1.2.4_coreos.1 "/hyperkube apiserver" 3 hours ago Up 3 hours k8s_kube-apiserver.33667886_kube-apiserver-37.139.31.151_kube-system_bfdfe85e7787a05e49ebfe95e7d4a401_abd7982f
52e25d838af3 gcr.io/google_containers/pause:2.0 "/pause" 3 hours ago Up 3 hours k8s_POD.6059dfa2_kube-apiserver-37.139.31.151_kube-system_bfdfe85e7787a05e49ebfe95e7d4a401_411b1a93
```
Clearly something is wrong with kube-controller-manager. The two main sources of debugging information (as far as I know) are the journal and the docker logs.

The docker logs show nothing at all:
```
core@helena-coreos ~ $ docker logs 423aa16a66e7
core@helena-coreos ~ $
```
I also tried to get into this container with docker exec, but that didn't work either.

So my hopes rested on the journal:
```
Jul 15 13:16:59 helena-coreos kubelet-wrapper[2318]: I0715 13:16:59.143892 2318 manager.go:2050] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-37.139.31.151_kube-system(f11896dcf9adf655df092f2a12a41673)
Jul 15 13:16:59 helena-coreos kubelet-wrapper[2318]: E0715 13:16:59.143992 2318 pod_workers.go:138] Error syncing pod f11896dcf9adf655df092f2a12a41673, skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-37.139.31.151_kube-system(f11896dcf9adf655df092f2a12a41673)"
```
So kube-controller-manager fails to start, but I have no idea why. How can I troubleshoot this?
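One thing that might explain the empty output: container 423aa16a66e7 in the listing above is the pod's pause container (image gcr.io/google_containers/pause:2.0, name starting with k8s_POD), not kube-controller-manager itself. A crashed container has exited, so it would only show up in `docker ps -a`. Below is a minimal sketch for picking out its ID. The helper names `container_ids` and `crash_logs` are made up for illustration, and the `k8s_<container>.` name prefix is an assumption based on how the kube-apiserver container is named above.

```shell
# Read `docker ps -a` style lines on stdin and print the ID (first column)
# of every container whose name (last column) starts with "k8s_<container>.".
# The pause container is skipped because its name starts with "k8s_POD.".
container_ids() {
  awk -v prefix="k8s_$1." 'index($NF, prefix) == 1 { print $1 }'
}

# Hypothetical wrapper: print the last 50 log lines of every matching
# container, including exited ones. Only defined here, not executed.
crash_logs() {
  docker ps -a | container_ids "$1" | while read -r id; do
    echo "== $id =="
    docker logs --tail 50 "$id" 2>&1
  done
}
```

On the node this would be run as `crash_logs kube-controller-manager`. Whether the exited container is still around depends on how quickly the kubelet cleans up old containers.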
I've dumped the configuration I used below:
```
core@helena-coreos ~ $ cat /etc/flannel/options.env
FLANNELD_IFACE=37.139.31.151
FLANNELD_ETCD_ENDPOINTS=https://37.139.31.151:2379
FLANNELD_ETCD_CAFILE=/etc/ssl/etcd/ca.pem
FLANNELD_ETCD_CERTFILE=/etc/ssl/etcd/coreos.pem
FLANNELD_ETCD_KEYFILE=/etc/ssl/etcd/coreos-key.pem
```
```
core@helena-coreos ~ $ cat /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
```
```
core@helena-coreos ~ $ cat /etc/systemd/system/docker.service.d/35-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
```
```
core@helena-coreos ~ $ cat /etc/systemd/system/kubelet.service
[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
Environment=KUBELET_VERSION=v1.2.4_coreos.cni.1
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://127.0.0.1:8080 \
  --network-plugin-dir=/etc/kubernetes/cni/net.d \
  --network-plugin= \
  --register-schedulable=false \
  --allow-privileged=true \
  --config=/etc/kubernetes/manifests \
  --hostname-override=37.139.31.151 \
  --cluster-dns=10.3.0.10 \
  --cluster-domain=cluster.local
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
name: kube-apiserver
namespace: kube-system
spec:
containers:
-
command:
- /hyperkube
- apiserver
- "--bind-address=0.0.0.0"
- "--etcd-servers=https://37.139.31.151:2379"
- "--etcd-cafile=/etc/kubernetes/ssl/ca.pem"
- "--etcd-certfile=/etc/kubernetes/ssl/worker.pem"
- "--etcd-keyfile=/etc/kubernetes/ssl/worker-key.pem"
- "--allow-privileged=true"
- "--service-cluster-ip-range=10.3.0.0/24"
- "--secure-port=443"
- "--advertise-address=37.139.31.151"
- "--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota"
- "--tls-cert-file=/etc/kubernetes/ssl/apiserver.pem"
- "--tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
- "--client-ca-file=/etc/kubernetes/ssl/ca.pem"
- "--service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
- "--runtime-config=extensions/v1beta1=true,extensions/v1beta1/thirdpartyresources=true"
image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
name: kube-apiserver
ports:
-
containerPort: 443
hostPort: 443
name: https
-
containerPort: 8080
hostPort: 8080
name: local
volumeMounts:
-
mountPath: /etc/kubernetes/ssl
name: ssl-certs-kubernetes
readOnly: true
-
mountPath: /etc/ssl/certs
name: ssl-certs-host
readOnly: true
hostNetwork: true
volumes:
-
hostPath:
path: /etc/kubernetes/ssl
name: ssl-certs-kubernetes
-
hostPath:
path: /usr/share/ca-certificates
name: ssl-certs-host
```
```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-proxy.yaml
apiVersion: v1
kind: Pod
metadata:
name: kube-proxy
namespace: kube-system
spec:
hostNetwork: true
containers:
- name: kube-proxy
image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
command:
- /hyperkube
- proxy
- "--master=http://127.0.0.1:8080"
- "--proxy-mode=iptables"
securityContext:
privileged: true
volumeMounts:
- mountPath: /etc/ssl/certs
name: ssl-certs-host
readOnly: true
volumes:
- hostPath:
path: /usr/share/ca-certificates
name: ssl-certs-host
```
```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
name: kube-controller-manager
namespace: kube-system
spec:
hostNetwork: true
containers:
- name: kube-controller-manager
image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
command:
- /hyperkube
- controller-manager
- "--master=http://127.0.0.1:8080"
- "--leader-elect=true"
- "--service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem"
- "--root-ca-file=/etc/kubernetes/ssl/ca.pem"
livenessProbe:
httpGet:
host: "127.0.0.1"
path: /healthz
port: 10252
initialDelaySeconds: 15
timeoutSeconds: 1
volumeMounts:
- mountPath: /etc/kubernetes/ssl
name: ssl-certs-kubernetes
readOnly: true
- mountPath: /etc/ssl/certs
name: ssl-certs-host
readOnly: true
volumes:
- hostPath:
path: /etc/kubernetes/ssl
name: ssl-certs-kubernetes
- hostPath:
path: /usr/share/ca-certificates
name: ssl-certs-host
```
```
core@helena-coreos ~ $ cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
name: kube-scheduler
namespace: kube-system
spec:
hostNetwork: true
containers:
- name: kube-scheduler
image: "quay.io/coreos/hyperkube:v1.2.4_coreos.1"
command:
- /hyperkube
- scheduler
- "--master=http://127.0.0.1:8080"
- "--leader-elect=true"
livenessProbe:
httpGet:
host: "127.0.0.1"
path: /healthz
port: 10251
initialDelaySeconds: 15
timeoutSeconds: 1
```
I didn't set up Calico.

I checked the network config in etcd:
```
core@helena-coreos ~ $ etcdctl get /coreos.com/network/config
{"Network":"10.2.0.0/16", "Backend":{"Type":"vxlan"}}
```
I also checked that the API is working:
```
core@helena-coreos ~ $ curl http://127.0.0.1:8080/version
{
"major": "1",
"minor": "2",
"gitVersion": "v1.2.4+coreos.1",
"gitCommit": "7f80f816ee1a23c26647aee8aecd32f0b21df754",
"gitTreeState": "clean"
}
```
OK, thanks to @Sasha Kurakin, I'm one step further :)
```
core@helena-coreos ~ $ ./kubectl describe pods kube-controller-manager-37.139.31.151 --namespace="kube-system"
Name: kube-controller-manager-37.139.31.151
Namespace: kube-system
Node: 37.139.31.151/37.139.31.151
Start Time: Fri, 15 Jul 2016 09:52:19 +0000
Labels: <none>
Status: Running
IP: 37.139.31.151
Controllers: <none>
Containers:
kube-controller-manager:
Container ID: docker://6fee488ee838f60157b071113e43182c97b4217018933453732290a4f131767d
Image: quay.io/coreos/hyperkube:v1.2.4_coreos.1
Image ID: docker://sha256:2cac344d3116165bd808b965faae6cd9d46e840b9d70b40d8e679235aa9a6507
Port:
Command:
/hyperkube
controller-manager
--master=http://127.0.0.1:8080
--leader-elect=true
--service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
--root-ca-file=/etc/kubernetes/ssl/ca.pem
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Fri, 15 Jul 2016 14:30:40 +0000
Finished: Fri, 15 Jul 2016 14:30:58 +0000
Ready: False
Restart Count: 12
Liveness: http-get http://127.0.0.1:10252/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
Environment Variables:
Conditions:
Type Status
Ready False
Volumes:
ssl-certs-kubernetes:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
ssl-certs-host:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
No events.
```
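One detail in this output may be worth a closer look: both volumes are reported as EmptyDir, although the manifest declares them as hostPath volumes. If that is really what the kubelet parsed, /etc/kubernetes/ssl/apiserver-key.pem would not exist inside the container, which might be enough to make the controller manager exit. This is a guess, not a confirmed cause. Below is a small sketch that prints each volume with its reported type from `kubectl describe` output. The helper name `volume_types` is hypothetical, and it assumes kubectl's usual indented layout.

```shell
# Read `kubectl describe pod` output on stdin and print "<volume> <type>"
# for every entry in the Volumes section.
volume_types() {
  awk '
    /^Volumes:/               { in_vol = 1; next }
    in_vol && /^[^ \t]/       { in_vol = 0 }   # next top-level section ends the list
    in_vol && $1 == "Type:"   { print name, $2; next }
    in_vol && $1 == "Medium:" { next }
    in_vol && NF == 1 && $1 ~ /:$/ { name = substr($1, 1, length($1) - 1) }
  '
}
```

It could be fed with `./kubectl describe pods kube-controller-manager-37.139.31.151 --namespace=kube-system | volume_types`. If the types come back as EmptyDir, the `volumes:` section of the manifest (in particular the indentation of `path:` under `hostPath:`) would be the first thing to compare against the tutorial.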