Kubelet无法启动:退出状态崩溃:255 / n / a

时间:2018-06-22 06:59:46

标签: kubernetes kubelet

Ubuntu 16.04 LTS,Docker 17.12.1,Kubernetes 1.10.0

Kubelet无法启动:

6月22日06:45:57 dev-master systemd [1]:kubelet.service:主进程已退出,代码已退出,状态为255 / n / a

6月22日06:45:57 dev-master systemd [1]:kubelet.service:失败,返回结果为“退出代码”。

注意:v1.9.1没问题

日志:

Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.518085   20051 docker_service.go:249] Docker Info: &{ID:WDJK:3BCI:BGCM:VNF3:SXGW:XO5G:KJ3Z:EKIH:XGP7:XJGG:LFBL:YWAJ Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:1 Driver:btrfs DriverStatus:[[Build Version Btrfs v4.15.1] [Library Vers
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.521232   20051 docker_service.go:262] Setting cgroupDriver to cgroupfs
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.532834   20051 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.533812   20051 kuberuntime_manager.go:186] Container runtime docker initialized, version: 18.05.0-ce, apiVersion: 1.37.0
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.534071   20051 csi_plugin.go:61] kubernetes.io/csi: plugin initializing...
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.534846   20051 kubelet.go:903] Accelerators feature is deprecated and will be removed in v1.11. Please use device plugins instead. They can be enabled using the DevicePlugins feature gate.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.535035   20051 kubelet.go:909] GPU manager init error: couldn't get a handle to the library: unable to open a handle to the library, GPU feature is disabled.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535082   20051 server.go:129] Starting to listen on 0.0.0.0:10250
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535164   20051 kubelet.go:1282] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535189   20051 server.go:944] Started kubelet
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535555   20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535825   20051 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536202   20051 status_manager.go:140] Starting to sync pod status with apiserver
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536253   20051 kubelet.go:1782] Starting kubelet main sync loop.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536285   20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536464   20051 volume_manager.go:247] Starting Kubelet Volume Manager
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536613   20051 desired_state_of_world_populator.go:129] Desired state populator starts to run
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.538574   20051 server.go:299] Adding debug handlers to kubelet server.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.538664   20051 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.539199   20051 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636465   20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636795   20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.638630   20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.638954   20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.836686   20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.839219   20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.841028   20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.841357   20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.236826   20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.241590   20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.245081   20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.245475   20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.492206   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.493216   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.494240   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.036893   20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.045705   20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.047489   20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.047787   20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.413319   20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.492781   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.493560   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.494574   20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.549477   20051 manager.go:340] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.659932   20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661447   20051 cpu_manager.go:155] [cpumanager] starting with none policy
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661459   20051 cpu_manager.go:156] [cpumanager] reconciling every 10s
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661468   20051 policy_none.go:42] [cpumanager] none policy: Start
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.661523   20051 fs.go:539] stat failed on /dev/loop10 with error: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: F0622 06:45:57.661535   20051 kubelet.go:1359] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 126 in cached partitions map
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Failed with result 'exit-code'.

4 个答案:

答案 0 :(得分:2)

我在您的日志中发现了很多相同的错误消息:

dial tcp 10.50.50.201:8001: getsockopt: connection refused

可能有几个问题:

  • IP地址和/或端口不正确
  • 从工作人员到主人没有访问权限
  • 您的Master出了点问题,例如kube-apiserver关闭

您应该朝那个方向看。

答案 1 :(得分:0)

我具有相同的退出状态,但是由于max_user_watches数量的限制,我的kubelet无法启动。以下使kubelet重新工作 https://github.com/google/cadvisor/issues/1581#issuecomment-367616070

答案 2 :(得分:0)

在所有节点上运行以下命令。它对我有用。

new Date().getFullYear()

答案 3 :(得分:0)

user1188867's answer绝对正确。

我想添加一条信息,供那些不使用Ubuntu的人参考。 我在裸机上使用Clear Linux的群集上实现了这一目标。我在这里附上有关如何在这种环境下检测问题并禁用交换功能来解决问题的说明。

首先,重新启动后启动sudo systemctl status kubelet会产生以下信息:

● kubelet.service - kubelet: The Kubernetes Node Agent 
    Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Drop-In: /etc/systemd/system/kubelet.service.d
            └─0-cni.conf, 0-containerd.conf, 10-cgroup-driver.conf
    Active: activating (auto-restart) (Result: exit-code) since Thu 2020-12-17 11:04:37 CET; 2s ago
      Docs: http://kubernetes.io/docs/
   Process: 2404 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, st> 
   Process: 2405 ExecStartPost=/usr/bin/kubelet-version-check.sh store (code=exited, status=0/SUCCESS)
  Main PID: 2404 (code=exited, status=255/EXCEPTION)
       CPU: 683ms

问题实际上是交换文件的存在。 要禁用它:

  1. nofail选项添加到/usr/lib/systemd/system/var-swapfile.swap。我个人使用了以下一种代码:sudo sed -i s/Options=/Options=nofail,/ /usr/lib/systemd/system/var-swapfile.swap
  2. 禁用交换:sudo swapoff -a
  3. 删除交换文件:sudo rm /var/swapfile

在Clear Linux上,此过程可在重新引导后继续禁用交换。