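I have created a CronJob in Kubernetes with the schedule 8 * * * *. The job's backoffLimit is left at its default of 6, the pod's restartPolicy is Never, and the pod is deliberately configured to FAIL. As I understand it (for a podSpec with restartPolicy: Never), the Job controller will try to create backoffLimit pods and then mark the job as Failed, so I expected to see 6 pods in the Error state.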
This is the job's actual status:
status:
  conditions:
  - lastProbeTime: 2019-02-20T05:11:58Z
    lastTransitionTime: 2019-02-20T05:11:58Z
    message: Job has reached the specified backoff limit
    reason: BackoffLimitExceeded
    status: "True"
    type: Failed
  failed: 5
Why are there only 5 failed pods instead of 6? Or is my understanding of backoffLimit incorrect?
Answer 0 (score: 3)
Use spec.backoffLimit to specify the number of retries before a Job is considered failed. The back-off limit is set to 6 by default.
Answer 1 (score: 0)
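In short: you might not be seeing all of the created pods because the schedule interval in the CronJob is too short.
As described in the documentation:
"Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The back-off count is reset if no new failed Pods appear before the Job's next status check."
If a new job is scheduled before the Job controller has had a chance to recreate the pod (bearing in mind the delay after the previous failure), the Job controller starts counting from one again.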
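To make the back-off timing concrete, here is a minimal Python sketch of the schedule quoted above. It assumes the delay simply doubles from 10s and is capped at 360s; the function names are hypothetical, and this is an idealized model rather than the Job controller's actual code (real timings also include pod scheduling and container startup).

```python
from itertools import accumulate

BASE_DELAY_S = 10      # first delay quoted in the documentation
MAX_DELAY_S = 6 * 60   # "capped at six minutes"


def backoff_delays(retries):
    """Idealized delay before each retry: 10s, 20s, 40s, ... capped at 360s."""
    return [min(BASE_DELAY_S * 2 ** i, MAX_DELAY_S) for i in range(retries)]


def cumulative_delays(retries):
    """Total back-off time accumulated up to and including each retry."""
    return list(accumulate(backoff_delays(retries)))


if __name__ == "__main__":
    delays = backoff_delays(6)
    totals = cumulative_delays(6)
    for n, (delay, total) in enumerate(zip(delays, totals), start=1):
        print(f"retry {n}: wait {delay}s (total {total}s)")
```

Under this model the waits before the fifth and sixth retries are already 160s and 320s, so the later pods of a job appear minutes apart rather than seconds apart.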
I reproduced your issue in GKE using the following .yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hellocron
spec:
  schedule: "*/3 * * * *" # Runs every 3 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hellocron
            image: busybox
            args:
            - /bin/cat
            - /etc/os
          restartPolicy: Never
      backoffLimit: 6
  suspend: false
This job will fail because the file /etc/os does not exist.
Here is the kubectl describe output for one of the jobs:
Name:           hellocron-1551194280
Namespace:      default
Selector:       controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
Labels:         controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
                job-name=hellocron-1551194280
Annotations:    <none>
Controlled By:  CronJob/hellocron
Parallelism:    1
Completions:    1
Start Time:     Tue, 26 Feb 2019 16:18:07 +0100
Pods Statuses:  0 Running / 0 Succeeded / 6 Failed
Pod Template:
  Labels:  controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
           job-name=hellocron-1551194280
  Containers:
   hellocron:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Args:
      /bin/cat
      /etc/os
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type     Reason                Age   From            Message
  ----     ------                ----  ----            -------
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-4lf6h
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-85khk
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-wrktb
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-6942s
  Normal   SuccessfulCreate      25m   job-controller  Created pod: hellocron-1551194280-662zv
  Normal   SuccessfulCreate      22m   job-controller  Created pod: hellocron-1551194280-6c6rh
  Warning  BackoffLimitExceeded  17m   job-controller  Job has reached the specified backoff limit
Note the delay between the creation of pods hellocron-1551194280-662zv and hellocron-1551194280-6c6rh (25m versus 22m in the Age column).