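kubelet fails to get cgroup stats for docker and kubelet services

Date: 2017-10-13 09:02:38

Tags: kubernetes cgroups

I am running Kubernetes on bare-metal Debian (3 masters, 2 workers, a PoC for now). I followed k8s-hard-way and ran into the following problem on my kubelet: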

  

Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

I get the same message for kubelet.service.

I do have some files for these cgroups:

$ ls /sys/fs/cgroup/systemd/system.slice/docker.service
cgroup.clone_children  cgroup.procs  notify_on_release  tasks

$ ls /sys/fs/cgroup/systemd/system.slice/kubelet.service/
cgroup.clone_children  cgroup.procs  notify_on_release  tasks
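
The listings above only look under the systemd hierarchy. As far as I understand (this is an assumption, not something confirmed in this thread), the kubelet reads its stats from resource-controller hierarchies such as cpu,cpuacct and memory, so it may help to check whether the same cgroup path exists there as well. A minimal sketch; check_cgroup is a hypothetical helper name, and the three hierarchy names are taken from the mount directories in the cAdvisor output:

```shell
#!/bin/sh
# Report, for a few cgroup hierarchies, whether a given cgroup path exists.
# Usage: check_cgroup <cgroup-root> <cgroup-path>
# check_cgroup is a hypothetical helper, not part of any Kubernetes tooling.
check_cgroup() {
  root="$1"
  path="$2"
  for hierarchy in systemd cpu,cpuacct memory; do
    if [ -d "$root/$hierarchy$path" ]; then
      echo "$hierarchy: present"
    else
      echo "$hierarchy: missing"
    fi
  done
}

# Example, to be run on the affected node:
#   check_cgroup /sys/fs/cgroup /system.slice/docker.service
```

If the path shows up only under systemd, that would be consistent with the "unknown container" error, although other causes are possible.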

cAdvisor tells me:

$ curl http://127.0.0.1:4194/validate
cAdvisor version: 

OS version: Debian GNU/Linux 8 (jessie)

Kernel version: [Supported and recommended]
    Kernel version is 3.16.0-4-amd64. Versions >= 2.6 are supported. 3.0+ are recommended.


Cgroup setup: [Supported and recommended]
    Available cgroups: map[cpu:1 memory:1 freezer:1 net_prio:1 cpuset:1 cpuacct:1 devices:1 net_cls:1 blkio:1 perf_event:1]
    Following cgroups are required: [cpu cpuacct]
    Following other cgroups are recommended: [memory blkio cpuset devices freezer]
    Hierarchical memory accounting enabled. Reported memory usage includes memory used by child containers.


Cgroup mount setup: [Supported and recommended]
    Cgroups are mounted at /sys/fs/cgroup.
    Cgroup mount directories: blkio cpu cpu,cpuacct cpuacct cpuset devices freezer memory net_cls net_cls,net_prio net_prio perf_event systemd 
    Any cgroup mount point that is detectible and accessible is supported. /sys/fs/cgroup is recommended as a standard location.
    Cgroup mounts:
    cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
    cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
    cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
    cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
    cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
    cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
    cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
    cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
    cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0


Managed containers: 
    /kubepods/burstable/pod76099b4b-af57-11e7-9b82-fa163ea0076a
    /kubepods/besteffort/pod6ed4ee49-af53-11e7-9b82-fa163ea0076a/f9da6bf60a186c47bd704bbe3cc18b25d07d4e7034d185341a090dc3519c047a
            Namespace: docker
            Aliases:
                    k8s_tiller_tiller-deploy-cffb976df-5s6np_kube-system_6ed4ee49-af53-11e7-9b82-fa163ea0076a_1
                    f9da6bf60a186c47bd704bbe3cc18b25d07d4e7034d185341a090dc3519c047a
    /kubepods/burstable/pod76099b4b-af57-11e7-9b82-fa163ea0076a/956911118c342375abfb7a07ec3bb37451bbc64a1e141321b6284cf5049e385f

Edit

Disabling the cAdvisor port on the kubelet (--cadvisor-port=0) does not fix the problem.

5 Answers:

Answer 0 (score: 20)

Try starting kubelet with:
--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice

I use this solution on RHEL7 with Kubelet 1.8.0 and Docker 1.12.

Answer 1 (score: 10)

angeloxx's workaround also works on the default AWS image for kops (k8s-1.8-debian-jessie-amd64-hvm-ebs-2017-12-02 (ami-bd229ec4)).

sudo vim /etc/sysconfig/kubelet

Append to the end of the DAEMON_ARGS string:

 --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice

Finally:

sudo systemctl restart kubelet
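
The manual edit in this answer can be sketched as a small helper that is safe to run more than once. It assumes the file holds a single-line DAEMON_ARGS="..." assignment in double quotes, which may not match every image; append_kubelet_flags is a hypothetical name, and the sketch leaves a .bak copy next to the file:

```shell
#!/bin/sh
# Append the two cgroup flags to DAEMON_ARGS unless they are already there.
# Usage: append_kubelet_flags <path-to-kubelet-sysconfig-file>
# Assumes a single-line DAEMON_ARGS="..." assignment (an assumption).
append_kubelet_flags() {
  file="$1"
  flags='--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice'
  # Skip the edit if the flag is already present, so reruns change nothing.
  if grep -q -- '--runtime-cgroups=' "$file"; then
    return 0
  fi
  # Insert the flags just before the closing quote; keep a .bak backup.
  sed -i.bak "s|^\(DAEMON_ARGS=\".*\)\"[[:space:]]*\$|\1 $flags\"|" "$file"
}

# Example, to be run as root on the node:
#   append_kubelet_flags /etc/sysconfig/kubelet && systemctl restart kubelet
```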

Answer 2 (score: 1)

Thanks, angeloxx!

I am following the Kubernetes guide: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/setup-ha-etcd-with-kubeadm/

In the instructions, they ask you to create the file: /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf

It contains the line:

ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd

I took your answer and added it to the end of the ExecStart line:

ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice

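I am writing this in case it helps someone else.

@wolmi Thanks for the edit!

One more note: the configuration above is for my etcd cluster, not for Kubernetes nodes. On a node, a file like 20-etcd-service-manager.conf overrides all the settings in the "10-kubeadm.conf" file, so various parts of the configuration get lost. For nodes, use the "/var/lib/kubelet/config.yaml" file and/or /var/lib/kubelet/kubeadm-flags.env instead.

Answer 3 (score: 0)

In addition to this change, I also had to run yum update to get it working. This may help others trying this workaround.

Answer 4 (score: 0)

For those a bit further along: as described above, I had to add the following.

Append to the end of the DAEMON_ARGS string: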

--runtime-cgroups=/lib/systemd/system/kubelet.service --kubelet-cgroups=/lib/systemd/system/kubelet.service

Then: sudo systemctl restart kubelet

But I found I was still getting:

Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

Restarting dockerd resolved the error: sudo systemctl restart docker
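
To check whether the error has really stopped after the restarts, the kubelet log can be filtered for the message. A minimal sketch, assuming the kubelet runs as the systemd unit named kubelet and logs to the journal; count_stat_errors is a hypothetical helper that counts matching lines on standard input:

```shell
#!/bin/sh
# Count log lines on stdin that contain the stats error from this thread.
# count_stat_errors is a hypothetical helper name.
count_stat_errors() {
  # grep -c prints the count; it exits non-zero when the count is 0,
  # so "|| true" keeps the function from reporting failure in that case.
  grep -c 'Failed to get system container stats' || true
}

# Example, to be run a few minutes after restarting docker and kubelet:
#   journalctl -u kubelet --since "5 minutes ago" | count_stat_errors
```

A count of 0 over a window that starts after the restart suggests the workaround took effect.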

After a bit more digging, I found a better solution, which is to add this to the kops configuration:

https://github.com/kubernetes/kops/issues/4049