K8s Pod readiness probe fails with "connection refused", but the Pod is serving requests

Date: 2020-06-02 15:16:08

Tags: kubernetes openshift

I'm having a hard time understanding why the Pod's readiness probe is failing.

  Warning  Unhealthy  21m (x2 over 21m)  kubelet, REDACTED  Readiness probe failed: Get http://192.168.209.74:8081/actuator/health: dial tcp 192.168.209.74:8081: connect: connection refused

If I exec into this pod (or, in fact, into any other pod running that application), I can curl that exact URL without any problem:

kubectl exec -it REDACTED-l2z5w /bin/bash
$ curl -v http://192.168.209.74:8081/actuator/health
* Expire in 0 ms for 6 (transfer 0x5611b949ff50)
*   Trying 192.168.209.74...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5611b949ff50)
* Connected to 192.168.209.74 (192.168.209.74) port 8081 (#0)
> GET /actuator/health HTTP/1.1
> Host: 192.168.209.74:8081
> User-Agent: curl/7.64.0
> Accept: */*
> 
< HTTP/1.1 200 
< Set-Cookie: CM_SESSIONID=E62390F0FF8C26D51C767835988AC690; Path=/; HttpOnly
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Frame-Options: DENY
< Content-Type: application/vnd.spring-boot.actuator.v3+json
< Transfer-Encoding: chunked
< Date: Tue, 02 Jun 2020 15:07:21 GMT
< 
* Connection #0 to host 192.168.209.74 left intact
{"status":"UP",...REDACTED..}

I get this behavior both on the Docker-for-Desktop k8s cluster on my Mac and on an OpenShift cluster.

The readiness probe shows up in kubectl describe like this:

    Readiness:  http-get http://:8081/actuator/health delay=20s timeout=3s period=5s #success=1 #failure=10

The Helm chart configures it as follows:

    readinessProbe:
      failureThreshold: 10
      httpGet:
        path: /actuator/health
        port: 8081
        scheme: HTTP
      initialDelaySeconds: 20
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 3

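I can't completely rule out that HTTP proxy settings are to blame, but the k8s docs say that HTTP_PROXY has been ignored for checks since v1.13, so that should not be the cause locally.

The OpenShift k8s version is 1.11; my local one is 1.16.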

1 answer:

Answer 0 (score: 2)

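kubectl describe always shows the most recent events recorded for the resource you are inspecting. The point is that the last event recorded here was an error while checking the readinessProbe.

I tested this in my lab with the following Pod manifest: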

apiVersion: v1
kind: Pod
metadata:
  name: readiness-exec
spec:
  containers:
  - name: readiness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - sleep 30; touch /tmp/healthy; sleep 600
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5

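As you can see, the file /tmp/healthy is created in the pod after 30 seconds, while the readinessProbe starts checking whether the file exists after 5 seconds and repeats the check every 5 seconds.

Describing this pod gives me: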

Events:
  Type     Reason     Age                    From                 Message
  ----     ------     ----                   ----                 -------
  Normal   Scheduled  7m56s                  default-scheduler    Successfully assigned default/readiness-exec to yaki-118-2
  Normal   Pulling    7m55s                  kubelet, yaki-118-2  Pulling image "k8s.gcr.io/busybox"
  Normal   Pulled     7m55s                  kubelet, yaki-118-2  Successfully pulled image "k8s.gcr.io/busybox"
  Normal   Created    7m55s                  kubelet, yaki-118-2  Created container readiness
  Normal   Started    7m55s                  kubelet, yaki-118-2  Started container readiness
  Warning  Unhealthy  7m25s (x6 over 7m50s)  kubelet, yaki-118-2  Readiness probe failed: cat: can't open '/tmp/healthy': No such file or directory

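The readinessProbe looked for the file 6 times without success, which is exactly what should happen, since I configured it to check every 5 seconds and the file is only created after 30 seconds.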
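The count of 6 can be reproduced with a little arithmetic. The following is only a rough sketch of the schedule, not how the kubelet is implemented: the function name and the `horizon` parameter are made up for illustration, real probe timing has jitter, and a probe landing exactly on the 30-second mark is assumed to fail because `touch` only runs once `sleep 30` has finished.

```python
def failed_probe_times(initial_delay, period, ready_at, horizon):
    """Return the probe times (seconds after container start) that fail.

    Idealized model: the first probe fires at initial_delay, later probes
    every `period` seconds, and a probe fails if it fires at or before
    `ready_at` (the moment the health file appears).
    """
    failed = []
    t = initial_delay
    while t <= horizon:
        if t <= ready_at:  # assumption: a probe exactly at ready_at still fails
            failed.append(t)
        t += period
    return failed


# Values taken from the manifest above: initialDelaySeconds=5, periodSeconds=5,
# file created after `sleep 30`.
failures = failed_probe_times(initial_delay=5, period=5, ready_at=30, horizon=60)
print(failures)       # [5, 10, 15, 20, 25, 30]
print(len(failures))  # 6
```

Under these assumptions the failures span 25 seconds, which lines up with the gap between 7m50s and 7m25s in the event above.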

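What you think is a problem is actually the expected behavior. Your event tells you that the readinessProbe check last failed 21 minutes ago (x2, i.e. twice in total). In practice, this means your pod has been healthy since 21 minutes ago.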
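To make the Age column easier to read: a value such as `7m25s (x6 over 7m50s)` means the event was last seen 7m25s ago, occurred 6 times, and was first seen 7m50s ago. Below is a minimal sketch that splits such a cell into those three parts. The helper names are hypothetical, and the regular expressions only cover the formats that appear in this thread.

```python
import re

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def duration_seconds(text):
    """Convert a duration such as '7m25s' or '21m' to seconds."""
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", text))


def parse_event_age(age):
    """Return (last_seen_seconds, count, first_seen_seconds) for an Age cell."""
    match = re.fullmatch(r"(\S+)\s+\(x(\d+)\s+over\s+(\S+)\)", age.strip())
    if match is None:
        # A bare duration means the event was recorded only once.
        seconds = duration_seconds(age)
        return seconds, 1, seconds
    last, count, first = match.groups()
    return duration_seconds(last), int(count), duration_seconds(first)


print(parse_event_age("7m25s (x6 over 7m50s)"))  # (445, 6, 470)
print(parse_event_age("21m (x2 over 21m)"))      # (1260, 2, 1260)
```

For the event in the question, both the first and the last failure are about 21 minutes old, so no readiness failure has been recorded since then.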