我正在测试 pod 出错时的 kubernetes 行为。
我现在有一个 pod 处于 CrashLoopBackOff 状态,是由 liveness probe failed 导致的,从我在 kubernetes 事件中看到的,pod 在尝试 3 次后变成 CrashLoopBackOff 并开始回退重启,但相关的 Liveness probe failed 事件不会'更新?
➜ ~ kubectl describe pods/my-nginx-liveness-err-59fb55cf4d-c6p8l
Name: my-nginx-liveness-err-59fb55cf4d-c6p8l
Namespace: default
Priority: 0
Node: minikube/192.168.99.100
Start Time: Thu, 15 Jul 2021 12:29:16 +0800
Labels: pod-template-hash=59fb55cf4d
run=my-nginx-liveness-err
Annotations: <none>
Status: Running
IP: 172.17.0.3
IPs:
IP: 172.17.0.3
Controlled By: ReplicaSet/my-nginx-liveness-err-59fb55cf4d
Containers:
my-nginx-liveness-err:
Container ID: docker://edc363b76811fdb1ccacdc553d8de77e9d7455bb0d0fb3cff43eafcd12ee8a92
Image: nginx
Image ID: docker-pullable://nginx@sha256:353c20f74d9b6aee359f30e8e4f69c3d7eaea2f610681c4a95849a2fd7c497f9
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 15 Jul 2021 13:01:36 +0800
Finished: Thu, 15 Jul 2021 13:02:06 +0800
Ready: False
Restart Count: 15
Liveness: http-get http://:8080/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r7mh4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-r7mh4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 37m default-scheduler Successfully assigned default/my-nginx-liveness-err-59fb55cf4d-c6p8l to minikube
Normal Created 35m (x4 over 37m) kubelet Created container my-nginx-liveness-err
Normal Started 35m (x4 over 37m) kubelet Started container my-nginx-liveness-err
Normal Killing 35m (x3 over 36m) kubelet Container my-nginx-liveness-err failed liveness probe, will be restarted
Normal Pulled 31m (x7 over 37m) kubelet Container image "nginx" already present on machine
Warning Unhealthy 16m (x32 over 36m) kubelet Liveness probe failed: Get "http://172.17.0.3:8080/": dial tcp 172.17.0.3:8080: connect: connection refused
Warning BackOff 118s (x134 over 34m) kubelet Back-off restarting failed container
BackOff 事件在 118s 前更新,但 Unhealthy 事件在 16m 前更新?
为什么我只有 15 次 Restart Count 而 BackOff 事件有 134 次?
我正在使用 minikube,我的部署是这样的:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx-liveness-err
spec:
selector:
matchLabels:
run: my-nginx-liveness-err
replicas: 1
template:
metadata:
labels:
run: my-nginx-liveness-err
spec:
containers:
- name: my-nginx-liveness-err
image: nginx
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
livenessProbe:
httpGet:
path: /
port: 8080
答案 0 :(得分:1)
我认为您可能混淆了状态条件和事件。事件不会“更新”,它们只是存在。它是来自控制器的用于调试或警报的事件数据流。 Age
列是该事件类型的最新实例的相对时间戳,您可以查看是否进行了一些基本的重复数据删除。事件也会在几个小时后过期,以防止数据库爆炸。
所以您的问题与活性探针无关,您的容器在启动时崩溃。