Question

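I have two Kubernetes clusters running on GKE. Three days ago, all of the nodes in both clusters were recreated, apparently for no reason. I only found out today, when we noticed that one of our services was offline because one of its pods had failed to restart.

I fixed the problem with the pod, but I would still like to know: how can I determine whether, and why, the nodes were recreated?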

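If I SSH into a node and check the kubelet logs with journalctl -u kubelet, the logs begin 3 days ago (when the nodes were recreated).

If I look in the GCP Cloud Console, I see the following under GCE / Operations: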

Create an instance template gke-onyx-default-pool-76ffec1b service-152001628404@container-engine-robot.iam.gserviceaccount.com
Set instance template of an instance group manager gke-onyx-default-pool-06fb264e-grp service-152001628404@container-engine-robot.iam.gserviceaccount.com
Remove instance from a target pool ae8c5f04a34fe11e9a2ea42010a9a0fd service-152001628404@container-engine-robot.iam.gserviceaccount.com
...

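The same entries appear for every node. From what I can gather, GCP decided to create a new instance template with new nodes and migrated the cluster onto them.

How can I find out why this happened?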

Understanding why the node template was recreated on GKE

0 answers: