kubelet does not register after initial deployment, only after a restart

Time: 2017-01-11 12:16:32

Tags: deployment kubernetes kubelet

I am seeing strange behavior from the kubelet: shortly after the cluster is bootstrapped, the kubelet does not register with the API server.

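Interestingly, if I restart the kubelet daemon, it registers correctly and everything works fine, which leads me to believe this is a synchronization issue. (I am using CoreOS with cloud-config, and the kubelet is configured as a systemd unit.)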

Shortly after the Kubernetes node is deployed, the kubelet log shows only the following entries and nothing more:

-- Logs begin at Wed 2017-01-11 10:59:51 UTC, end at Wed 2017-01-11 11:58:35 UTC. --
Jan 11 11:00:47 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:00:47 worker0 kubelet[1712]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793484    1712 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793603    1712 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:00:47 worker0 kubelet[1712]: E0111 11:00:47.793740    1712 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.804434    1712 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"

If I restart the kubelet, I see the expected logs and it registers with the API server as expected. The kubelet log after the restart is below:

-- Logs begin at Wed 2017-01-11 10:59:51 UTC, end at Wed 2017-01-11 11:58:35 UTC. --
Jan 11 11:00:47 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:00:47 worker0 kubelet[1712]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793484    1712 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793603    1712 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:00:47 worker0 kubelet[1712]: E0111 11:00:47.793740    1712 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.804434    1712 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"
Jan 11 11:58:26 worker0 systemd[1]: Stopping Kubernetes Kubelet...
Jan 11 11:58:26 worker0 systemd[1]: Stopped Kubernetes Kubelet.
Jan 11 11:58:26 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:58:26 worker0 kubelet[5180]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.501190    5180 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.501525    5180 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.501775    5180 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.521821    5180 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.554844    5180 manager.go:148] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: ge
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.562578    5180 fs.go:116] Filesystem partitions: map[/dev/sda3:{mountpoint:/usr major:8 minor:3 fsType:ext4 blockSize:0} /dev/sda6:{mou
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.567504    5180 manager.go:195] Machine: {NumCores:2 CpuFrequency:2299998 MemoryCapacity:1045340160 MachineID:bed23c2c06d642f1904ebbe67a
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.572042    5180 manager.go:201] Version: {KernelVersion:4.7.3-coreos-r3 ContainerOsVersion:CoreOS 1185.5.0 (MoreOS) DockerVersion:1.11.2
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.574264    5180 kubelet.go:255] Adding manifest file: /opt/kubernetes/manifests
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.574340    5180 kubelet.go:265] Watching apiserver
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.633161    5180 kubelet_network.go:71] Hairpin mode set to "promiscuous-bridge" but configureCBR0 is false, falling back to "hairpin-vet
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.633682    5180 kubelet.go:516] Hairpin mode set to "hairpin-veth"
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.641810    5180 docker_manager.go:242] Setting dockerRoot to /var/lib/docker
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.642560    5180 kubelet_network.go:306] Setting Pod CIDR:  -> 172.20.31.1/24
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.644117    5180 server.go:714] Started kubelet v1.4.0
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.647154    5180 kubelet.go:1094] Image garbage collection failed: unable to find data for container /
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.650196    5180 kubelet_node_status.go:194] Setting node annotation to enable volume controller attach/detach
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.651955    5180 server.go:118] Starting to listen on 0.0.0.0:10250
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.668376    5180 kubelet.go:2127] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.668432    5180 kubelet.go:2135] Failed to check if disk space is available on the root partition: failed to get fs info for "root": una
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674021    5180 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674110    5180 status_manager.go:129] Starting to sync pod status with apiserver
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674141    5180 kubelet.go:2229] Starting kubelet main sync loop.
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674208    5180 kubelet.go:2240] skipping pod synchronization - [network state unknown container runtime is down]
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.675339    5180 volume_manager.go:234] Starting Kubelet Volume Manager
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.713597    5180 factory.go:295] Registering Docker factory
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.717164    5180 manager.go:244] Registration of the rkt container factory failed: unable to communicate with Rkt api service: rkt: canno
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.717777    5180 factory.go:54] Registering systemd factory
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.719843    5180 factory.go:86] Registering Raw factory
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.723229    5180 manager.go:1082] Started watching for new ooms in manager
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.725579    5180 oomparser.go:185] oomparser using systemd
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.728010    5180 manager.go:285] Starting recovery of all containers
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.837552    5180 kubelet_node_status.go:194] Setting node annotation to enable volume controller attach/detach
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.878400    5180 kubelet_node_status.go:64] Attempting to register node worker0
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.919196    5180 kubelet_node_status.go:67] Successfully registered node worker0
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.924483    5180 kubelet_network.go:306] Setting Pod CIDR: 172.20.31.1/24 ->
Jan 11 11:58:27 worker0 kubelet[5180]: I0111 11:58:27.104781    5180 manager.go:290] Recovery completed
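
To tell from the journal which kubelet start actually reached registration, the lines can be grouped by PID and checked for the "Successfully registered node" message. This is only a minimal sketch, assuming the journal line format shown above; the function name `registration_by_pid` is made up for illustration and is not part of any Kubernetes tooling.

```python
import re

# Matches the "kubelet[PID]:" prefix that journald puts on each kubelet line.
PID_RE = re.compile(r"kubelet\[(\d+)\]:")


def registration_by_pid(journal_lines):
    """Map each kubelet PID seen in the journal to True/False.

    True means that process logged a successful node registration.
    (Hypothetical helper, written for the log format in this question.)
    """
    status = {}
    for line in journal_lines:
        match = PID_RE.search(line)
        if not match:
            continue  # systemd lines, blank lines, journal headers
        pid = int(match.group(1))
        status.setdefault(pid, False)
        if "Successfully registered node" in line:
            status[pid] = True
    return status


if __name__ == "__main__":
    # Three lines taken from the logs in the question.
    sample = [
        'Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.804434    1712 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"',
        "Jan 11 11:58:26 worker0 systemd[1]: Started Kubernetes Kubelet.",
        "Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.919196    5180 kubelet_node_status.go:67] Successfully registered node worker0",
    ]
    for pid, ok in registration_by_pid(sample).items():
        print(pid, "registered" if ok else "never registered")
```

Feeding it the output of `journalctl -u kubelet` (the unit name is an assumption) would show at a glance that the first process stalled before registration while the restarted one completed it.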

Any idea how to solve this kind of problem?

Thanks, Davide

2 answers:

Answer 0: (score: 1)

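This sounds like a delay while waiting for docker to start or for the network interfaces to initialize properly. I found the following issue, which sounds exactly like your problem: https://github.com/kubernetes/kubernetes/issues/33789#issuecomment-251251196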

  

"The fix might be to add a condition such as 'if configure-cbr = true AND network-plugin = none or noop', then do not check /etc/default/docker to decide whether to restart docker."

Answer 1: (score: 0)

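The problem turned out to be my cloud-config file: if you use cloud-config to sequence the daemon startup order, you should not also configure startup directives (such as Requires/After) in the daemon unit files. Otherwise cloud-init and systemd end up "fighting" each other!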
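
To find units that carry their own ordering directives, the unit file text can be scanned for them. This is a rough sketch only: `ordering_directives` is a hypothetical helper, the sample unit is invented and is not the asker's actual file, and checking `Wants` and `Before` in addition to the `Requires`/`After` pair named above is my own assumption.

```python
# Real systemd [Unit] option names; the helper around them is hypothetical.
ORDERING_KEYS = {"Requires", "Wants", "After", "Before"}


def ordering_directives(unit_text):
    """Return (key, value) pairs for ordering/dependency directives
    found in the text of a systemd unit file."""
    found = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue  # skip blanks, comments, and section headers
        key, sep, value = line.partition("=")
        if sep and key.strip() in ORDERING_KEYS:
            found.append((key.strip(), value.strip()))
    return found


if __name__ == "__main__":
    # Invented example content, for illustration only.
    sample_unit = """\
[Unit]
Description=Kubernetes Kubelet
Requires=docker.service
After=docker.service

[Service]
ExecStart=/usr/bin/kubelet
Restart=always
"""
    for key, value in ordering_directives(sample_unit):
        print(f"{key}={value}")
```

Any unit that is also started in a fixed order from cloud-config and shows up in this output would be a candidate for the conflict described above.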