I deployed some VMs using Vagrant to test kubernetes:
master: 4 CPUs, 4GB RAM
node-1: 4 CPUs, 8GB RAM
Base box: centos/7.
Networking: bridged.
Host OS: CentOS 7.2
I deployed kubernetes with kubeadm, following the kubeadm getting started guide. After adding the node to the cluster and installing Weave Net, kube-dns unfortunately fails to get up and running because it stays in the ContainerCreating state:
[vagrant@master ~]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY     STATUS              RESTARTS   AGE
kube-system   etcd-master                       1/1       Running             0          1h
kube-system   kube-apiserver-master             1/1       Running             0          1h
kube-system   kube-controller-manager-master    1/1       Running             0          1h
kube-system   kube-discovery-982812725-0tiiy    1/1       Running             0          1h
kube-system   kube-dns-2247936740-46rcz         0/3       ContainerCreating   0          1h
kube-system   kube-proxy-amd64-4d8s7            1/1       Running             0          1h
kube-system   kube-proxy-amd64-sqea1            1/1       Running             0          1h
kube-system   kube-scheduler-master             1/1       Running             0          1h
kube-system   weave-net-h1om2                   2/2       Running             0          1h
kube-system   weave-net-khebq                   1/2       CrashLoopBackOff    17         1h
I believe the problem is related to the weave-net pod on node-1, which is in CrashLoopBackOff status:
[vagrant@master ~]$ kubectl describe pods --namespace=kube-system weave-net-khebq
Name: weave-net-khebq
Namespace: kube-system
Node: node-1/10.0.2.15
Start Time: Wed, 05 Oct 2016 07:10:39 +0000
Labels: name=weave-net
Status: Running
IP: 10.0.2.15
Controllers: DaemonSet/weave-net
Containers:
weave:
Container ID: docker://4976cd0ec6f971397aaf7fbfd746ca559322ab3d8f4ee217dd6c8bd3f6ed4f76
Image: weaveworks/weave-kube:1.7.0
Image ID: docker://sha256:1ac5304168bd9dd35c0ecaeb85d77d26c13a7d077aa8629b2a1b4e354cdffa1a
Port:
Command:
/home/weave/launch.sh
Requests:
cpu: 10m
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 05 Oct 2016 08:18:51 +0000
Finished: Wed, 05 Oct 2016 08:18:51 +0000
Ready: False
Restart Count: 18
Liveness: http-get http://127.0.0.1:6784/status delay=30s timeout=1s period=10s #success=1 #failure=3
Volume Mounts:
/etc from cni-conf (rw)
/host_home from cni-bin2 (rw)
/opt from cni-bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kir36 (ro)
/weavedb from weavedb (rw)
Environment Variables:
WEAVE_VERSION: 1.7.0
weave-npc:
Container ID: docker://feef7e7436d2565182d99c9021958619f65aff591c576a0c240ac0adf9c66a0b
Image: weaveworks/weave-npc:1.7.0
Image ID: docker://sha256:4d7f0bd7c0e63517a675e352146af7687a206153e66bdb3d8c7caeb54802b16a
Port:
Requests:
cpu: 10m
State: Running
Started: Wed, 05 Oct 2016 07:11:04 +0000
Ready: True
Restart Count: 0
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kir36 (ro)
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
weavedb:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
cni-bin:
Type: HostPath (bare host directory volume)
Path: /opt
cni-bin2:
Type: HostPath (bare host directory volume)
Path: /home
cni-conf:
Type: HostPath (bare host directory volume)
Path: /etc
default-token-kir36:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-kir36
QoS Class: Burstable
Tolerations: dedicated=master:Equal:NoSchedule
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 3m 19 {kubelet node-1} spec.containers{weave} Normal Pulling pulling image "weaveworks/weave-kube:1.7.0"
1h 3m 19 {kubelet node-1} spec.containers{weave} Normal Pulled Successfully pulled image "weaveworks/weave-kube:1.7.0"
55m 3m 11 {kubelet node-1} spec.containers{weave} Normal Created (events with common reason combined)
55m 3m 11 {kubelet node-1} spec.containers{weave} Normal Started (events with common reason combined)
1h 14s 328 {kubelet node-1} spec.containers{weave} Warning BackOff Back-off restarting failed docker container
1h 14s 300 {kubelet node-1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "weave" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=weave pod=weave-net-khebq_kube-system(d1feb9c1-8aca-11e6-8d4f-525400c583ad)"
Listing the containers running on node-1:
[vagrant@node-1 ~]$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
feef7e7436d2 weaveworks/weave-npc:1.7.0 "/usr/bin/weave-npc" About an hour ago Up About an hour k8s_weave-npc.e6299282_weave-net-khebq_kube-system_d1feb9c1-8aca-11e6-8d4f-525400c583ad_0f0517cf
762cd80d491e gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Up About an hour k8s_POD.d8dbe16c_weave-net-khebq_kube-system_d1feb9c1-8aca-11e6-8d4f-525400c583ad_cda766ac
8c3395959d0e gcr.io/google_containers/kube-proxy-amd64:v1.4.0 "/usr/local/bin/kube-" About an hour ago Up About an hour k8s_kube-proxy.64a0bb96_kube-proxy-amd64-4d8s7_kube-system_909e6ae1-8aca-11e6-8d4f-525400c583ad_48e7eb9a
d0fbb716bbf3 gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Up About an hour k8s_POD.d8dbe16c_kube-proxy-amd64-4d8s7_kube-system_909e6ae1-8aca-11e6-8d4f-525400c583ad_d6b232ea
The logs of the first container show some connection errors:
[vagrant@node-1 ~]$ sudo docker logs feef7e7436d2
E1005 08:46:06.368703 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:154: Failed to list *api.Pod: Get https://100.64.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:06.370119 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:155: Failed to list *extensions.NetworkPolicy: Get https://100.64.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:06.473779 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:153: Failed to list *api.Namespace: Get https://100.64.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.370451 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:154: Failed to list *api.Pod: Get https://100.64.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.371308 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:155: Failed to list *extensions.NetworkPolicy: Get https://100.64.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.474991 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:153: Failed to list *api.Namespace: Get https://100.64.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
I lack the experience with kubernetes and container networking to troubleshoot these issues further, so any hints are much appreciated. Observation: all pods/nodes report their IP as 10.0.2.15, which is the local Vagrant NAT address, not the actual IP address of the VMs.
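For reference, here is how that observation can be confirmed on each VM (a minimal diagnostic sketch; eth0 being the Vagrant NAT interface is typical for Vagrant boxes but is an assumption about this setup):

# On every Vagrant box, eth0 is usually the NAT interface (10.0.2.15);
# a bridged, host-reachable address would live on a second interface.
ip -4 addr show
# From the master: compare with the address kubernetes recorded for the node.
kubectl describe node node-1 | grep -A 3 Addresses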
Answer 0 (score: 12)
Here is the recipe that worked for me (as of March 19, 2017, using Vagrant and VirtualBox). The cluster consists of 3 nodes: 1 master and 2 worker nodes.
1) Make sure the IP of the master node is set explicitly at init:
kubeadm init --api-advertise-addresses=10.30.3.41
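If in doubt about which address to advertise, you can list the master's interfaces and pick the one on the host-reachable network rather than Vagrant's NAT; a small sketch (the interface name eth1 is an assumption, typical of a Vagrant box with a second network attached):

# eth0 is normally the Vagrant NAT interface (10.0.2.15 on every box);
# advertise the address on the second, host-reachable interface instead.
ip -4 addr show eth1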
2) Manually, or during provisioning, add to /etc/hosts on each node the exact IP that you are going to configure it with. Below is a line you can add in your Vagrantfile (node naming convention I use: k8node-#{i}):
config.vm.provision :shell, inline: "sed 's/127\.0\.0\.1.*k8node.*/10.30.3.4#{i} k8node-#{i}/' -i /etc/hosts"
Example:
vagrant@k8node-1:~$ cat /etc/hosts
10.30.3.41 k8node-1
127.0.0.1 localhost
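If you prefer to apply the same change by hand on an already-running node instead of from the Vagrantfile, the equivalent one-off command would be (a sketch, using the node name and IP from the example above):

# Replace the loopback mapping for this hostname with the node's real IP.
sudo sed -i 's/127\.0\.0\.1.*k8node.*/10.30.3.41 k8node-1/' /etc/hosts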
3) Finally, all nodes will try to reach the master using the cluster's public IP (not sure why this happens...). Here is the workaround.
First, find the public IP by running the following command on the master:
kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 10.96.0.1 <none> 443/TCP 1h
On each node, make sure that any process using 10.96.0.1 (in my case) gets routed to the master at 10.30.3.41. So on each node (you can skip the master), use route to set up the redirect:
route add 10.96.0.1 gw 10.30.3.41
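To verify that the redirect took effect, and to keep it across reboots, something like the following should work (a sketch; the CentOS/RHEL route-file mechanism and the eth1 interface name are assumptions about your setup):

# Confirm the service IP is now routed via the master.
ip route get 10.96.0.1
# On CentOS/RHEL, persist the host route for eth1 across reboots.
echo "10.96.0.1 via 10.30.3.41" | sudo tee -a /etc/sysconfig/network-scripts/route-eth1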
After that, everything should work fine:
vagrant@k8node-1:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-rnl2f 1/1 Running 0 1h
kube-system etcd-k8node-1 1/1 Running 0 1h
kube-system kube-apiserver-k8node-1 1/1 Running 0 1h
kube-system kube-controller-manager-k8node-1 1/1 Running 0 1h
kube-system kube-discovery-1769846148-g8g85 1/1 Running 0 1h
kube-system kube-dns-2924299975-7wwm6 4/4 Running 0 1h
kube-system kube-proxy-9dxsb 1/1 Running 0 46m
kube-system kube-proxy-nx63x 1/1 Running 0 1h
kube-system kube-proxy-q0466 1/1 Running 0 1h
kube-system kube-scheduler-k8node-1 1/1 Running 0 1h
kube-system weave-net-2nc8d 2/2 Running 0 46m
kube-system weave-net-2tphv 2/2 Running 0 1h
kube-system weave-net-mp6s0 2/2 Running 0 1h
vagrant@k8node-1:~$ kubectl get nodes
NAME STATUS AGE
k8node-1 Ready,master 1h
k8node-2 Ready 1h
k8node-3 Ready 48m
Answer 1 (score: 0)