Ubuntu 16.04 LTS,Docker 17.12.1,Kubernetes 1.10.0
Kubelet无法启动:
6月22日06:45:57 dev-master systemd [1]:kubelet.service:主进程已退出,代码已退出,状态为255 / n / a
6月22日06:45:57 dev-master systemd [1]:kubelet.service:失败,返回结果为“退出代码”。
注意:v1.9.1没问题
日志:
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.518085 20051 docker_service.go:249] Docker Info: &{ID:WDJK:3BCI:BGCM:VNF3:SXGW:XO5G:KJ3Z:EKIH:XGP7:XJGG:LFBL:YWAJ Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:1 Driver:btrfs DriverStatus:[[Build Version Btrfs v4.15.1] [Library Vers
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.521232 20051 docker_service.go:262] Setting cgroupDriver to cgroupfs
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.532834 20051 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.533812 20051 kuberuntime_manager.go:186] Container runtime docker initialized, version: 18.05.0-ce, apiVersion: 1.37.0
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.534071 20051 csi_plugin.go:61] kubernetes.io/csi: plugin initializing...
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.534846 20051 kubelet.go:903] Accelerators feature is deprecated and will be removed in v1.11. Please use device plugins instead. They can be enabled using the DevicePlugins feature gate.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.535035 20051 kubelet.go:909] GPU manager init error: couldn't get a handle to the library: unable to open a handle to the library, GPU feature is disabled.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535082 20051 server.go:129] Starting to listen on 0.0.0.0:10250
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535164 20051 kubelet.go:1282] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535189 20051 server.go:944] Started kubelet
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535555 20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535825 20051 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536202 20051 status_manager.go:140] Starting to sync pod status with apiserver
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536253 20051 kubelet.go:1782] Starting kubelet main sync loop.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536285 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536464 20051 volume_manager.go:247] Starting Kubelet Volume Manager
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536613 20051 desired_state_of_world_populator.go:129] Desired state populator starts to run
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.538574 20051 server.go:299] Adding debug handlers to kubelet server.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.538664 20051 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.539199 20051 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636465 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636795 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.638630 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.638954 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.836686 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.839219 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.841028 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.841357 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.236826 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.241590 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.245081 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.245475 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.492206 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.493216 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.494240 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.036893 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.045705 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.047489 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.047787 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.413319 20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.492781 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.493560 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.494574 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.549477 20051 manager.go:340] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.659932 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661447 20051 cpu_manager.go:155] [cpumanager] starting with none policy
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661459 20051 cpu_manager.go:156] [cpumanager] reconciling every 10s
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661468 20051 policy_none.go:42] [cpumanager] none policy: Start
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.661523 20051 fs.go:539] stat failed on /dev/loop10 with error: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: F0622 06:45:57.661535 20051 kubelet.go:1359] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 126 in cached partitions map
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
答案 0 :(得分:2)
我在您的日志中发现了很多相同的错误消息:
dial tcp 10.50.50.201:8001: getsockopt: connection refused
可能有几个问题:
您应该朝那个方向看。
答案 1 :(得分:0)
我具有相同的退出状态,但是由于max_user_watches数量的限制,我的kubelet无法启动。以下使kubelet重新工作 https://github.com/google/cadvisor/issues/1581#issuecomment-367616070
答案 2 :(得分:0)
在所有节点上运行以下命令。它对我有用。
new Date().getFullYear()
答案 3 :(得分:0)
user1188867's answer绝对正确。
我想添加一条信息,供那些不使用Ubuntu的人参考。 我在裸机上使用Clear Linux的群集上实现了这一目标。我在这里附上有关如何在这种环境下检测问题并禁用交换功能来解决问题的说明。
首先,重新启动后启动sudo systemctl status kubelet
会产生以下信息:
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─0-cni.conf, 0-containerd.conf, 10-cgroup-driver.conf
Active: activating (auto-restart) (Result: exit-code) since Thu 2020-12-17 11:04:37 CET; 2s ago
Docs: http://kubernetes.io/docs/
Process: 2404 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, st>
Process: 2405 ExecStartPost=/usr/bin/kubelet-version-check.sh store (code=exited, status=0/SUCCESS)
Main PID: 2404 (code=exited, status=255/EXCEPTION)
CPU: 683ms
问题实际上是交换文件的存在。 要禁用它:
nofail
选项添加到/usr/lib/systemd/system/var-swapfile.swap
。我个人使用了以下一种代码:sudo sed -i s/Options=/Options=nofail,/ /usr/lib/systemd/system/var-swapfile.swap
sudo swapoff -a
sudo rm /var/swapfile
在Clear Linux上,此过程可在重新引导后继续禁用交换。