I'm trying to find a way to restart a container when it fails, instead of having it deleted and a new one created in its place. Being able to retry the restart, say 3 times, and then stop the pod would be a plus.
I have a StatefulSet that looks like this (I've removed some irrelevant parts):
apiVersion: "apps/v1beta1"
kind: StatefulSet
metadata:
  name: cassandra-stateful
spec:
  serviceName: cassandra
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra-stateful
    spec:
      # Only one Cassandra node should exist for one Kubernetes node.
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - cassandra
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: cassandra
          image: localrepo/cassandra-kube
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          lifecycle:
            preStop:
              exec:
                command: ["pkill java && while ps -p 1 > /dev/null; do sleep 1; done"]
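As an aside, I suspect the single-string preStop command above is never interpreted by a shell (exec hooks run the command directly), so what I actually intended is probably closer to this sketch, which I haven't verified on my cluster:

          lifecycle:
            preStop:
              exec:
                # Run through an explicit shell so '&&' and the while loop are
                # interpreted (assumes /bin/sh exists in the image).
                command: ["/bin/sh", "-c", "pkill java && while ps -p 1 > /dev/null; do sleep 1; done"]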
I know it is recreating the container because I'm deliberately killing my process with:
pkill java && while ps -p 1 > /dev/null; do sleep 1; done
If I describe the pod, I can see that it creates a new container rather than restarting the existing one:
$ kubectl describe po cassandra-stateful-0
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
11m 11m 1 default-scheduler Normal Scheduled Successfully assigned cassandra-stateful-0 to node-136-225-226-236
11m 11m 1 kubelet, node-136-225-226-236 spec.containers{cassandra} Normal Created Created container with id cf5bbdc2989e231cdad4bb16dd26ad55b9a016200842cc3b2a3915f3d618737f
11m 11m 1 kubelet, node-136-225-226-236 spec.containers{cassandra} Normal Started Started container with id cf5bbdc2989e231cdad4bb16dd26ad55b9a016200842cc3b2a3915f3d618737f
4m 4m 1 kubelet, node-136-225-226-236 spec.containers{cassandra} Normal Created Created container with id fb4869eb91313512dc56608a6ef3d24590c88234a0ef453cd7c16dcf625e1f37
4m 4m 1 kubelet, node-136-225-226-236 spec.containers{cassandra} Normal Started Started container with id fb4869eb91313512dc56608a6ef3d24590c88234a0ef453cd7c16dcf625e1f37
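If it's relevant, the container's restart counter can also be read directly with something like the following (my assumption being that restartCount only increments on in-place restarts rather than on pod recreation):

kubectl get pod cassandra-stateful-0 -o jsonpath='{.status.containerStatuses[0].restartCount}'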
Is there any way to achieve this behavior?