Kubernetes - 主节点中的kube-system pod在工作节点加入

时间:2018-02-27 20:49:27

标签: kubernetes kubeadm flannel weave

我已关注此tutorial以及此tutorialthis one,但过去3天我遇到同样的问题。

我可以通过以下步骤正确设置主节点:

kubeadm init

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

export kubever=$(kubectl version | base64 | tr -d ‘\’)
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"

,一切似乎都很好。

kubectl get all --namespace=kube-system

然后,

工作节点上的

kubeadm join --token 864655.fdf6d0b389867b79 192.168.100.17:6443 --discovery-token-ca-cert-hash sha256:a2d840808b17b53b9612e6271ccde489f13dbede7d354f97188d0faa9e210af2

输出似乎很好,如下所示:

[preflight] Running pre-flight checks.
  [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.100.17:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.100.17:6443"
[discovery] Requesting info from "https://192.168.100.17:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.100.17:6443"
[discovery] Successfully established connection with API Server "192.168.100.17:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

但是一旦我运行这个命令,所有地狱都会松动。

kubectl get all --namespace=kube-system

开始显示所有pod都在重新启动。状态在Pending和Running之间不断变化,有时候一些pod甚至会消失,并且可能具有ContainerCreating状态等。

NAME                                READY     STATUS    RESTARTS   AGE
po/etcd-ubuntu                      0/1       Pending   0          0s
po/kube-controller-manager-ubuntu   0/1       Pending   0          0s
po/kube-dns-6f4fd4bdf-cmcfk         3/3       Running   0          13m
po/kube-proxy-2chb6                 1/1       Running   0          13m
po/kube-scheduler-ubuntu            0/1       Pending   0          0s
po/weave-net-ptdxr                  2/2       Running   0          11m

我还尝试了法兰绒的第二个教程,并得到完全相同的问题。

我的设置

我创建了两个新的虚拟机,在VMware上安装了全新的Ubuntu 17.10,每个虚拟机配备2个处理器/ 2核6 GB内存和50 GB硬盘。我的物理机器是一台i7-6700k,配备32GB的内存。 我在它们上面安装了kubeadm,kubelet和docker,然后按照上面提到的步骤进行操作。

我也尝试在VMware上切换NAT和Bridge,但没有任何改变。

具有桥接网络的两个VM的初始IP为192.168.100.12和192.168.100.17。 主人hostname -I

192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2

worker-node的hostname -I

192.168.100.12 172.17.0.1 10.44.0.0 10.32.0.1

journalctl -xeu kubelet显示以下内容:

https://gist.github.com/saad749/9a771a3460bf88c274498b5bc4b7fd84

尝试法兰绒时(仍然是同样的问题),结果来自

kubectl describe nodes

https://gist.github.com/saad749/d24c453c8b4e663e9abf572a0fb38bf4

我错过了kubeadm init之前的任何步骤吗?我应该更改IP地址(到什么地方)?我应该研究一下具体的日志吗?有更全面的教程吗? 所有问题在kubeadm加入工作节点后开始,我可以在主节点或任何其他东西上部署kubernetes,并且工作正常。

更新

即使应用了errordeveloper的建议,同样的问题仍然存在。

我将以下标志添加到kubeadm init:

--apiserver-advertise-address 192.168.100.17

我将kubeadm.conf更新为以下内容并重新加载并重新启动: https://gist.github.com/saad749/c7149c87ec3e75a40586f626cf04279a

并尝试更改群集dns https://gist.github.com/saad749/5fa66bebc22841e58119333e75600e40

初始化master后的日志:

kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE       IP               NODE
kube-system   etcd-ubuntu                      1/1       Running   0          22s       192.168.100.17   ubuntu
kube-system   kube-apiserver-ubuntu            1/1       Running   0          29s       192.168.100.17   ubuntu
kube-system   kube-controller-manager-ubuntu   1/1       Running   0          13s       192.168.100.17   ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         3/3       Running   0          1m        10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running   0          1m        192.168.100.17   ubuntu
kube-system   kube-scheduler-ubuntu            1/1       Running   0          34s       192.168.100.17   ubuntu
kube-system   weave-net-fkgnh                  2/2       Running   0          32s       192.168.100.17   ubuntu

主机名-i结果:

kube-master@ubuntu:~$ hostname -I
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2 10.32.0.3 10.32.0.4 10.32.0.5 10.32.0.6 10.244.0.0 10.244.0.1
kube-master@ubuntu:~$ hostname -i
192.168.100.17

结果来自:

kubectl describe nodes

https://gist.github.com/saad749/8f460650182a04d0ddf3158a52761a9a

内部IP现在似乎正确。

从第二个节点加入后,会发生这种情况:

kube-master@ubuntu:~$ kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
ubuntu    Ready     master    49m       v1.9.3
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS              RESTARTS   AGE       IP               NODE
kube-system   kube-controller-manager-ubuntu   0/1       Pending             0          0s        <none>           ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         0/3       ContainerCreating   0          49m       <none>           ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running             0          49m       192.168.100.17   ubuntu
kube-system   kube-scheduler-ubuntu            1/1       Running             0          1s        192.168.100.17   ubuntu
kube-system   weave-net-fkgnh                  2/2       Running             0          48m       192.168.100.17   ubuntu

ifconfig -a results:

https://gist.github.com/saad749/63a5a52bd3246ff72477b2aca7d158d0

journalctl -xeu kubelet结果

https://gist.github.com/saad749/8a60870b35f93df8565e66cb208aff32

有时,pods IP显示在192.168.100.12,这是非主第二节点的IP。

kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE       IP               NODE
kube-system   etcd-ubuntu                      0/1       Pending   0          0s        <none>           ubuntu
kube-system   kube-apiserver-ubuntu            0/1       Pending   0          0s        <none>           ubuntu
kube-system   kube-controller-manager-ubuntu   1/1       Running   0          0s        192.168.100.12   ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         2/3       Running   0          3h        10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running   0          3h        192.168.100.12   ubuntu
kube-system   kube-scheduler-ubuntu            0/1       Pending   0          0s        <none>           ubuntu
kube-system   weave-net-fkgnh                  2/2       Running   1          3h        192.168.100.17   ubuntu

kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                       READY     STATUS    RESTARTS   AGE       IP               NODE
kube-system   kube-dns-6f4fd4bdf-wfqhb   3/3       Running   0          3h        10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9           1/1       Running   0          3h        192.168.100.12   ubuntu
kube-system   weave-net-fkgnh            2/2       Running   0          3h        192.168.100.12   ubuntu


kubectl describe nodes
Name:               ubuntu
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=ubuntu
                    node-role.kubernetes.io/master=
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:  Fri, 02 Mar 2018 08:21:47 -0800
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Fri, 02 Mar 2018 11:38:36 -0800   Fri, 02 Mar 2018 08:21:43 -0800   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Fri, 02 Mar 2018 11:38:36 -0800   Fri, 02 Mar 2018 08:21:43 -0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 02 Mar 2018 11:38:36 -0800   Fri, 02 Mar 2018 08:21:43 -0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Fri, 02 Mar 2018 11:38:36 -0800   Fri, 02 Mar 2018 11:28:25 -0800   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.100.12
  Hostname:    ubuntu
Capacity:
 cpu:     4
 memory:  6080832Ki
 pods:    110
Allocatable:
 cpu:     4
 memory:  5978432Ki
 pods:    110
System Info:
 Machine ID:                 59bf65b835b242a3aa182f4b8a542219
 System UUID:                0C3C4D56-4747-D59E-EE09-F16F2793677E
 Boot ID:                    658b4a08-d724-425e-9246-2b41995ecc46
 Kernel Version:             4.13.0-36-generic
 OS Image:                   Ubuntu 17.10
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.13.1
 Kubelet Version:            v1.9.3
 Kube-Proxy Version:         v1.9.3
ExternalID:                  ubuntu
Non-terminated Pods:         (3 in total)
  Namespace                  Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                        ------------  ----------  ---------------  -------------
  kube-system                kube-dns-6f4fd4bdf-wfqhb    260m (6%)     0 (0%)      110Mi (1%)       170Mi (2%)
  kube-system                kube-proxy-h4hz9            0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                weave-net-fkgnh             20m (0%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  280m (7%)     0 (0%)      110Mi (1%)       170Mi (2%)
Events:
  Type     Reason                   Age                 From             Message
  ----     ------                   ----                ----             -------
  Warning  Rebooted                 12m (x814 over 2h)  kubelet, ubuntu  Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
  Normal   NodeHasNoDiskPressure    10m                 kubelet, ubuntu  Node ubuntu status is now: NodeHasNoDiskPressure
  Normal   Starting                 10m                 kubelet, ubuntu  Starting kubelet.
  Normal   NodeAllocatableEnforced  10m                 kubelet, ubuntu  Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientDisk    10m                 kubelet, ubuntu  Node ubuntu status is now: NodeHasSufficientDisk
  Normal   NodeHasSufficientMemory  10m                 kubelet, ubuntu  Node ubuntu status is now: NodeHasSufficientMemory
  Normal   NodeNotReady             10m                 kubelet, ubuntu  Node ubuntu status is now: NodeNotReady
  Warning  Rebooted                 2m (x870 over 2h)   kubelet, ubuntu  Node ubuntu has been rebooted, boot id: 658b4a08-d724-425e-9246-2b41995ecc46
  Warning  Rebooted                 15s (x60 over 10m)  kubelet, ubuntu  Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0

我做错了什么?

2 个答案:

答案 0 :(得分:4)

因此,在遵循@errordeveloper的建议并且仍然碰壁之后,我能够解决问题,结果非常简单。

我的VM都有相同的主机名。

hostname -f 

将返回

ubuntu

两者,显然会导致kubernetes出现问题。

我用

更改了非主节点上的名称
hostnamectl set-hostname kminion

并在以下文件中:

/etc/hostname
/etc/hosts

一切顺利进行!

答案 1 :(得分:1)

  

我应该更改IP地址(到什么地方)?

是的,这通常是在默认路由用于NAT访问Internet的虚拟机上运行的方法。

您希望使用桥接网络的IP,为您掌握192.168.100.17(但请仔细检查)。

首先,请尝试使用kubeadm init --apiserver-advertise-address 192.168.100.17,但这可能无法解决所有问题。

kubectl describe nodes的输出中,我可以看到这个

Addresses:
  InternalIP:  172.17.0.1
  Hostname:    ubuntu

所以你可能想确保kubelet也没有使用NATed接口,你需要使用kubelet的--node-ip标志。

但是,还有其他方法可以解决此问题,例如:如果你能确保hostname -i返回桥接接口的IP(你可以通过调整/etc/hosts来完成)。