Why does kubelet start a container more slowly than the docker CLI?

Asked: 2017-11-25 18:19:52

Tags: kubernetes google-kubernetes-engine

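I'm trying to understand why one of my containers starts more slowly when kubelet launches it than when I launch it directly with the docker CLI on the GKE node itself.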

Here are the kubelet logs. The container is started, but it stays not ready for 23 seconds:

18:49:55.000 Container image "eu.gcr.io/proj/ns/myimage@sha256:fff668" already present on machine
18:49:55.000 Created container
18:49:56.000 Started container
18:49:56.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:49:58.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:00.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:02.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:04.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:06.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:08.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:10.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:12.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:14.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:16.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory
18:50:18.000 Readiness probe failed: cat: /tmp/healthy: No such file or directory

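The application inside the container only actually comes up after those 23 seconds. I know this because the first thing it does is print the following log line, and then it writes the /tmp/healthy file for the readinessProbe.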

18:50:18.000 17:50:18,572|MainThread|INFO|cli|Starting application 
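For readers unfamiliar with this pattern, here is a minimal sketch of what the application side of such a file-based readiness check might look like. It is not the actual application code: the function names, the logging format string, and the file contents are assumptions made for illustration. Only the log message, the "cli" logger name, and the /tmp/healthy path come from the output above.

```python
import logging
from pathlib import Path


def mark_ready(path="/tmp/healthy"):
    """Create the file that an exec probe such as `cat /tmp/healthy` looks for."""
    # The content is arbitrary (assumption); the probe only needs `cat` to succeed.
    Path(path).write_text("ok\n")


def main(ready_path="/tmp/healthy"):
    # Assumed format string, chosen to resemble the log line shown above.
    logging.basicConfig(
        format="%(asctime)s|%(threadName)s|%(levelname)s|%(name)s|%(message)s",
        level=logging.INFO,
    )
    log = logging.getLogger("cli")
    log.info("Starting application")
    # Everything that runs before this point (interpreter startup, imports)
    # counts as "not ready" from the probe's point of view.
    mark_ready(ready_path)
    return ready_path


if __name__ == "__main__":
    main()
```

Until `mark_ready` runs, `cat /tmp/healthy` exits non-zero, which matches the repeated "No such file or directory" probe failures in the log.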

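However, when I print the current date and then start the container with the docker CLI (on the same node where kubelet ran it above), the container takes only about 1 second to start, as the following command shows.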

mark@gke-cluster-3 ~ $ date +"%Y-%m-%d %H:%M:%S.%N"; docker run -it eu.gcr.io/proj/ns/myimage@sha256:fff668
2017-11-25 16:37:01.188799045
2017-11-25 16:37:02,246|MainThread|INFO|cli|Starting application

This is driving me a bit crazy! Any ideas about what might be causing it are welcome :)

1 answer:

Answer 0 (score: 1)

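It turned out that these containers were slow to start because the CPU available to the Python interpreter was being limited at startup. I added a bash script that prints the datetime before launching the Python process, and the problem became very obvious once I changed the CPU resources available to the container.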

cpu: 10m

2017-12-18 08:05:46,1513584346 starting script
2017-12-18 08:06:22,318|MainThread|INFO|cli|Application startup

cpu: 50m

2017-12-18 08:15:11,1513584911 starting script
2017-12-18 08:15:27,317|MainThread|INFO|cli|Application startup

cpu: 100m

2017-12-18 08:07:46,1513584466 starting script
2017-12-18 08:07:53,218|MainThread|INFO|cli|Application startup

cpu: 150m

2017-12-18 08:18:16,1513585096 starting script
2017-12-18 08:18:20,730|MainThread|INFO|cli|Application startup

cpu: 200m

2017-12-18 08:09:14,1513584554 starting script
2017-12-18 08:09:17,922|MainThread|INFO|cli|Application startup
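To make the trend in these runs easier to see, here is a rough sketch that computes the startup delay for each setting and multiplies it by the CPU value. It assumes that each `cpu:` value is the CPU allowance the container actually got and that the process used all of it during startup, so the product is only an upper-bound estimate of the CPU time consumed. The script timestamps have one-second resolution, so each delay may be off by up to a second.

```python
from datetime import datetime

# (cpu in millicores, wrapper-script timestamp, first application log timestamp),
# copied from the runs above.
RUNS = [
    (10, "2017-12-18 08:05:46", "2017-12-18 08:06:22,318"),
    (50, "2017-12-18 08:15:11", "2017-12-18 08:15:27,317"),
    (100, "2017-12-18 08:07:46", "2017-12-18 08:07:53,218"),
    (150, "2017-12-18 08:18:16", "2017-12-18 08:18:20,730"),
    (200, "2017-12-18 08:09:14", "2017-12-18 08:09:17,922"),
]


def startup_seconds(script_ts, app_ts):
    """Wall-clock seconds between the wrapper script and the first app log line."""
    start = datetime.strptime(script_ts, "%Y-%m-%d %H:%M:%S")
    ready = datetime.strptime(app_ts, "%Y-%m-%d %H:%M:%S,%f")
    return (ready - start).total_seconds()


def summarize(runs=RUNS):
    """Return (millicores, wall seconds, estimated CPU-seconds) for each run."""
    rows = []
    for millicores, script_ts, app_ts in runs:
        wall = startup_seconds(script_ts, app_ts)
        # Assumption: the process used its full CPU allowance the whole time.
        rows.append((millicores, wall, wall * millicores / 1000))
    return rows


if __name__ == "__main__":
    for millicores, wall, cpu_seconds in summarize():
        print(f"cpu: {millicores:>3}m  startup: {wall:6.3f}s  ~CPU time: {cpu_seconds:.2f}s")
```

The delays come out at roughly 36, 16, 7, 5 and 4 seconds. For the 50m to 200m runs the product stays between about 0.7 and 0.8 CPU-seconds, which is what you would expect if a fixed amount of startup work were being stretched out by the CPU allowance. The 10m run (about 0.36) fits that pattern less neatly.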

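It's a bit frustrating, because the application only consumes about 10m of CPU while it is running. From here I'm going to look into module imports and the other suggestions in: https://lwn.net/Articles/730915/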
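As a starting point for the module import investigation, a small sketch like the following could show which imports dominate startup. The function name and the default module list are placeholders, not taken from the application. On Python 3.7 and later, running `python -X importtime` gives a more detailed breakdown without any extra code.

```python
import importlib
import sys
import time


def time_imports(module_names):
    """Return {module name: seconds spent importing it}, in the order given."""
    timings = {}
    for name in module_names:
        # Modules already in sys.modules import almost instantly, so the
        # numbers are only meaningful in a fresh interpreter.
        start = time.perf_counter()
        importlib.import_module(name)
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Placeholder module list; replace it with the application's own imports.
    names = sys.argv[1:] or ["json", "logging", "argparse"]
    slowest_first = sorted(time_imports(names).items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in slowest_first:
        print(f"{seconds * 1000:8.2f} ms  {name}")
```

Running this inside the container with the same CPU setting as the slow runs should give numbers comparable to the delays above. On an unconstrained machine the same imports will look much cheaper.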