Question

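Subject: Stopping the kubelet service on a worker node (or a node that hangs because of memory usage) leaves MySQL unable to terminate properly on the Kubernetes worker node.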

Storage used: rook-ceph

Scenario 1: Stop the kubelet service on the worker node that runs the MySQL pod.

Initial state:

root@master01:~/apps/mysql# kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
master01   Ready    master   9d    v1.18.5
master02   Ready    master   9d    v1.18.5
master03   Ready    master   9d    v1.18.5
worker01   Ready    <none>   9d    v1.18.5
worker02   Ready    <none>   9d    v1.18.5
worker03   Ready    <none>   9d    v1.18.5
worker04   Ready    <none>   9d    v1.18.5

root@master01:~/apps/mysql# kubectl get po -o wide 
NAME                     READY   STATUS    RESTARTS   AGE   IP          NODE       NOMINATED NODE   READINESS GATES
mysql-747d4cd75c-zk7mr   1/1     Running   0          16s   10.0.5.62   worker01   <none>           <none>

root@master01:~/apps/mysql# kubectl get deployment
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
mysql   1/1     1            1           2m2s

Test: Stop the kubelet service on worker01 with "# systemctl stop kubelet".

root@master01:~# kubectl get nodes 
NAME       STATUS     ROLES    AGE   VERSION
master01   Ready      master   9d    v1.18.5
master02   Ready      master   9d    v1.18.5
master03   Ready      master   9d    v1.18.5
worker01   NotReady   <none>   9d    v1.18.5
worker02   Ready      <none>   9d    v1.18.5
worker03   Ready      <none>   9d    v1.18.5
worker04   Ready      <none>   9d    v1.18.5

Expected: The pod is rescheduled to worker03, as configured by the node affinity.

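Result: The pod was rescheduled to worker03 successfully, but the old container of the mysql pod on worker01 was not terminated properly, so the new pod keeps logging errors even though it is in the Running state.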

root@worker01:~# docker ps | grep mysql
2d063f8d04e4        6e17b5012353           "/usr/local/bin/dock…"   9 minutes ago       Up 9 minutes                            k8s_mysql_mysql-747d4cd75c-zk7mr_251d7d3d-1201-4757-993d-a3c7d65f87b9_0
b6d2cab4ba2b        k8s.gcr.io/pause:3.2   "/pause"                 9 minutes ago       Up 9 minutes                            k8s_POD_mysql-747d4cd75c-zk7mr_251d7d3d-1201-4757-993d-a3c7d65f87b9_0

root@master01:~/apps/mysql# kubectl get po -o wide --watch
NAME                     READY   STATUS              RESTARTS   AGE     IP          NODE       NOMINATED NODE   READINESS GATES
mysql-747d4cd75c-hznns   0/1     ContainerCreating   0          3s      <none>      worker03   <none>           <none>
mysql-747d4cd75c-zk7mr   1/1     Terminating         0          4m22s   10.0.5.62   worker01   <none>           <none>
mysql-747d4cd75c-hznns   0/1     ContainerCreating   0          9s      <none>      worker03   <none>           <none>
mysql-747d4cd75c-hznns   1/1     Running             0          10s     10.0.19.93   worker03   <none>           <none>

root@master01:~/apps/mysql# kubectl logs -f mysql-747d4cd75c-hznns 
2020-07-16 09:35:33+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.29-1debian10 started.
2020-07-16 09:35:34+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2020-07-16 09:35:34+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.29-1debian10 started.
2020-07-16T09:35:35.271995Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2020-07-16T09:35:35.307021Z 0 [Note] mysqld (mysqld 5.7.29) starting as process 1 ...
2020-07-16T09:35:35.464486Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-07-16T09:35:35.464514Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-07-16T09:35:35.464521Z 0 [Note] InnoDB: Uses event mutexes
2020-07-16T09:35:35.464526Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2020-07-16T09:35:35.464531Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-07-16T09:35:35.464535Z 0 [Note] InnoDB: Using Linux native AIO
2020-07-16T09:35:35.464897Z 0 [Note] InnoDB: Number of pools: 1
2020-07-16T09:35:35.465055Z 0 [Note] InnoDB: Using CPU crc32 instructions
2020-07-16T09:35:35.466999Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-07-16T09:35:35.476381Z 0 [Note] InnoDB: Completed initialization of buffer pool
2020-07-16T09:35:35.479090Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-07-16T09:35:35.511655Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2020-07-16T09:35:35.511714Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2020-07-16T09:35:35.511723Z 0 [Note] InnoDB: Retrying to lock the first data file
2020-07-16T09:35:36.514159Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2020-07-16T09:35:36.514213Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2020-07-16T09:35:37.518876Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2020-07-16T09:35:37.518916Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2020-07-16T09:35:38.523434Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2020-07-16T09:35:38.523486Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2020-07-16T09:35:39.526138Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2020-07-16T09:35:39.526191Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2020-07-16T09:35:40.530406Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11

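Workarounds tried:

When the kubelet service on worker01 comes back up, the improperly terminated mysql container is removed, and the new pod on worker03 can access the file.
When worker01 ran out of memory, the problem was resolved only after the memory was freed.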

Answer 1

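The pod is stuck in the Terminating state because the kubelet, which is responsible for terminating the pod gracefully, has been stopped.

After stopping the kubelet service on a worker node, you should delete that worker node from the API server to clean up the terminating pods that were running on it. The pods can then be scheduled onto other available worker nodes.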

kubectl delete node nodename

Once the kubelet is running again on that worker node, it automatically registers the node with the API server.

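As a best practice, pods should use taint based eviction, so that they are evicted according to tolerationSeconds and conditions such as:

node.kubernetes.io/not-ready: The node is not ready. This corresponds to the NodeCondition Ready being "False".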
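
For illustration, the tolerations for taint based eviction could look roughly like this in the pod template of the mysql Deployment. This is only a sketch, not the asker's actual manifest: the 30-second value is an arbitrary assumption (when no such tolerations are set, Kubernetes normally adds them with tolerationSeconds: 300), and the node.kubernetes.io/unreachable entry is added here because a node whose kubelet has stopped reporting gets the Ready condition "Unknown", which corresponds to that taint.

```yaml
# Fragment of spec.template.spec in the Deployment (sketch, values are assumptions)
tolerations:
# Node Ready condition is "False"
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 30   # assumed value; evict the pod 30s after the taint appears
# Node Ready condition is "Unknown" (e.g. kubelet stopped or node hung)
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 30   # assumed value
```

Note that this only controls how quickly the pod is marked for eviction in the API server. The old container on the node keeps running until the kubelet is back or the node is removed.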
