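Kubernetes does not start after a system reboot (Ubuntu)

Date: 2017-04-25 06:48:15

Tags: ubuntu kubernetes

I installed K8s on two Ubuntu machines in VirtualBox (Master and Node01). After the installation (I followed the K8s documentation site), I ran kubectl get nodes and both servers were in status Ready. But after rebooting the systems, I got this: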

# kubectl get nodes 
The connection to the server localhost:8080 was refused - did you specify the 
right host or port? 

I checked the kubelet service, and it is running:

# systemctl status kubelet
kubelet.service - kubelet: The Kubernetes Node Agent 
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled) 
  Drop-In: /etc/systemd/system/kubelet.service.d 
           └─10-kubeadm.conf 
   Active: active (running) since Mon 2017-04-24 10:01:51 CEST; 15min ago 
     Docs: http://kubernetes.io/docs/ 
Main PID: 13128 (kubelet) 
    Tasks: 21 
   Memory: 48.2M 
      CPU: 58.014s 
   CGroup: /system.slice/kubelet.service 
           ├─13128 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --cluster-dns=10.96.0.10 --cluster-domain=cluster.local 
           └─13164 journalctl -k -f 

Apr 24 10:16:40 master kubelet[13128]: I0424 10:16:40.204156   13128 kuberuntime_manager.go:752] Back-off 5m0s restarting failed container=weave pod=weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776) 
Apr 24 10:16:40 master kubelet[13128]: E0424 10:16:40.204694   13128 pod_workers.go:182] Error syncing pod 4b7bb2f0-2691-11e7-bfb6-080027229776 ("weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)"), skipping: fail 
Apr 24 10:16:42 master kubelet[13128]: I0424 10:16:42.972302   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2b59d0d9-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:48 master kubelet[13128]: I0424 10:16:48.949731   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2bb42bc1-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:51 master kubelet[13128]: I0424 10:16:51.978663   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2b023c31-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:52 master kubelet[13128]: I0424 10:16:52.909589   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/4b7bb2f0-2691-11e7-bfb6-080027229776-default-token-gslqd" (spec.Name: " 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.186057   13128 kuberuntime_manager.go:458] Container {Name:weave Image:weaveworks/weave-kube:1.9.4 Command:[/home/weave/launch.sh] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env: 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.188091   13128 kuberuntime_manager.go:742] checking backoff for container "weave" in pod "weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)" 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.188717   13128 kuberuntime_manager.go:752] Back-off 5m0s restarting failed container=weave pod=weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776) 
Apr 24 10:16:53 master kubelet[13128]: E0424 10:16:53.189136   13128 pod_workers.go:182] Error syncing pod 4b7bb2f0-2691-11e7-bfb6-080027229776 ("weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)"), skipping: fail 

Here is the systemd journal of kubelet, including a restart: Google Drive

...I'm not sure what I missed in the docs or what is going wrong in kubelet. Could you please help me? :]

• Ubuntu version

cat /etc/os-release 
NAME="Ubuntu" 
VERSION="16.04.2 LTS (Xenial Xerus)" 
ID=ubuntu 
ID_LIKE=debian 
PRETTY_NAME="Ubuntu 16.04.2 LTS" 
VERSION_ID="16.04" 
HOME_URL="http://www.ubuntu.com/" 
SUPPORT_URL="http://help.ubuntu.com/" 
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/" 
VERSION_CODENAME=xenial 
UBUNTU_CODENAME=xenial 

• Kernel

# uname -a 
Linux ubuntu 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 

• Kubectl version

# kubectl version 
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:44:38Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 

• Kubeadm version

# kubeadm version 
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:33:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 

• Kubelet version

# kubelet --version 
Kubernetes v1.6.1 

• Docker version

# docker version 
Client: 
Version:      1.11.2 
API version:  1.23 
Go version:   go1.5.4 
Git commit:   b9f10c9 
Built:        Wed Jun  1 22:00:43 2016 
OS/Arch:      linux/amd64 

Server: 
Version:      1.11.2 
API version:  1.23 
Go version:   go1.5.4 
Git commit:   b9f10c9 
Built:        Wed Jun  1 22:00:43 2016 
OS/Arch:      linux/amd64 

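4 answers:

Answer 0: (score: 2)

I had a badly exported KUBECONFIG variable, which kubectl needs (the history is in the comments). I saved KUBECONFIG=$HOME/admin.conf in ~/.zprofile, and that solved my problem.

After reloading the environment variables, kubectl works: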

# kubectl get nodes                                          
NAME      STATUS     AGE       VERSION
master    Ready      5d        v1.6.1
node01    NotReady   5d        v1.6.1
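
When kubectl finds no usable kubeconfig, it falls back to localhost:8080, which matches the error in the question. The following is a minimal sketch for checking which file would be picked up, assuming the default location ~/.kube/config when KUBECONFIG is unset and a single path (not a colon-separated list) in the variable. The function name check_kubeconfig is hypothetical.

```shell
# Report whether the kubeconfig that kubectl would read exists and is readable.
# Assumption: KUBECONFIG holds a single path; the default is ~/.kube/config.
check_kubeconfig() {
    cfg="${KUBECONFIG:-$HOME/.kube/config}"
    if [ -r "$cfg" ]; then
        echo "ok: $cfg"
    else
        echo "missing: $cfg"
        return 1
    fi
}
```

Running check_kubeconfig in the same shell session where kubectl fails shows whether the variable points at a readable file; note that the variable must be exported for kubectl itself to see it.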

Answer 1: (score: 2)

I hit the same problem with Kubernetes 1.12.3 on Ubuntu 16.04.05. I looked at the kubelet logs by running

$ journalctl -u kubelet

In the logs I saw that k8s was complaining (exit status 255) that swap was on.

So I turned swap off by running

$ swapoff -a

Then I edited fstab and commented out the swap entry

$ vi /etc/fstab
#comment out line with swap

and rebooted the system. Once the system was back up, I verified that swap was disabled by running

$ free -m

and checking that the Swap row showed 0.

Then I verified that the kubelet service had started successfully by running

$ systemctl status kubelet

It had started successfully. I also re-checked the journalctl logs, and this time there was no swap error.

I verified the k8s node status by running

$ kubectl get nodes

which now works and shows the expected output.
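
The manual fstab edit described above can also be scripted. This is a minimal sketch, assuming a standard fstab layout in which the third whitespace-separated field is the filesystem type; the function name comment_out_swap is hypothetical, and it writes a .bak backup next to the file before changing it.

```shell
# Comment out every active swap entry in an fstab-style file.
# Assumption: the third field of each entry is the filesystem type.
comment_out_swap() {
    fstab_file="$1"
    # Keep a backup, then prefix '#' to lines that are not already
    # commented and whose filesystem type is "swap".
    cp "$fstab_file" "$fstab_file.bak"
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab_file.bak" > "$fstab_file"
}
```

On a real system this would be called as root with /etc/fstab as the argument, after swapoff -a; trying it on a copy of the file first is a safer way to check the result.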

Note: beforehand, I had also set KUBECONFIG in my .bash_profile file.

root@k8s-master:~# cat .bash_profile
export KUBECONFIG="/etc/kubernetes/admin.conf"

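Answer 2: (score: 1)

As noted in the comments, you really need to check whether the apiserver is up, since kubectl talks to the apiserver. From your description and your kubeadm version, though, I believe this duplicates a question I just answered, so I am copying that answer here.

In the current version of kubeadm (v1.6.1), the insecure port of the API server is disabled by default. You can verify this by checking the API server manifest at /etc/kubernetes/manifests/kube-apiserver.yaml, which contains the kube-apiserver argument --insecure-port=0.

You can either:

  • Correct this in a running cluster (run from /etc/kubernetes/manifests):

    $ mv kube-apiserver.yaml ../kube-apiserver.yaml
    # edit ../kube-apiserver.yaml to remove --insecure-port=0
    # or change it to --insecure-port=<WHATEVER_YOU_LIKE>
    $ mv ../kube-apiserver.yaml kube-apiserver.yaml
    
  • Do it right at startup. You need a kubeadm config file for this. A simple example: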

    apiVersion: kubeadm.k8s.io/v1alpha1
    kind: MasterConfiguration
    apiServerExtraArgs:
      insecure-port: "8080"  # or whatever you like
    
    # Then start the master node with `kubeadm init --config=<this-configure-file-path>`
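
To check which value a given manifest carries, the flag can be read directly from the file. This is a minimal sketch, assuming the flag appears in the manifest in the form --insecure-port=<number>; the function name insecure_port is hypothetical.

```shell
# Print the value of --insecure-port found in a kube-apiserver manifest,
# or "unset" when the flag is absent.
insecure_port() {
    manifest="$1"
    # Extract the digits after the flag; keep only the first match.
    port=$(sed -n 's/.*--insecure-port=\([0-9][0-9]*\).*/\1/p' "$manifest" | head -n 1)
    echo "${port:-unset}"
}
```

Calling insecure_port /etc/kubernetes/manifests/kube-apiserver.yaml before and after either change shows whether the edit took effect in the manifest.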
    

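Answer 3: (score: 0)

I'm no expert, but I found that the problem was that the containers for the API server etc. could not start.

It turned out to be a permission problem on the Docker Unix socket: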

$ sudo chmod 666 /var/run/docker.sock
$ sudo systemctl restart docker

This solved the problem for me.