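I am running a Django application in a Kubernetes cluster on gcloud. I implemented the database migration as a Helm pre-install hook that launches my app container and runs the migration. I use cloud-sql-proxy in a sidecar pattern, as recommended in the official tutorial: https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine
Basically, this launches my app and a cloud-sql-proxy container inside the pod described by the job. The problem is that cloud-sql-proxy never terminates after my app has finished the migration, so the pre-install job times out and my deployment is cancelled. How can I gracefully exit the cloud-sql-proxy container once my application container has finished, so that the job can complete?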
Here is my Helm pre-install hook template definition:
apiVersion: batch/v1
kind: Job
metadata:
  name: database-migration-job
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": hook-succeeded,hook-failed
spec:
  activeDeadlineSeconds: 230
  template:
    metadata:
      name: "{{ .Release.Name }}"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      restartPolicy: Never
      containers:
        - name: db-migrate
          image: {{ .Values.my-project.docker_repo }}{{ .Values.backend.image }}:{{ .Values.my-project.image.tag}}
          imagePullPolicy: {{ .Values.my-project.image.pullPolicy }}
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "{{ .Values.backend.django_settings_module }}"
            - name: SENDGRID_API_KEY
              valueFrom:
                secretKeyRef:
                  name: sendgrid-api-key
                  key: sendgrid-api-key
            - name: DJANGO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: django-secret-key
                  key: django-secret-key
            - name: DB_USER
              value: {{ .Values.postgresql.postgresqlUsername }}
            - name: DB_PASSWORD
              {{- if .Values.postgresql.enabled }}
              value: {{ .Values.postgresql.postgresqlPassword }}
              {{- else }}
              valueFrom:
                secretKeyRef:
                  name: database-password
                  key: database-pwd
              {{- end }}
            - name: DB_NAME
              value: {{ .Values.postgresql.postgresqlDatabase }}
            - name: DB_HOST
              {{- if .Values.postgresql.enabled }}
              value: "postgresql"
              {{- else }}
              value: "127.0.0.1"
              {{- end }}
          workingDir: /app-root
          command: ["/bin/sh"]
          args: ["-c", "python manage.py migrate --no-input"]
        {{- if eq .Values.postgresql.enabled false }}
        - name: cloud-sql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.17
          command:
            - "/cloud_sql_proxy"
            - "-instances=<INSTANCE_CONNECTION_NAME>=tcp:<DB_PORT>"
            - "-credential_file=/secrets/service_account.json"
          securityContext:
            #fsGroup: 65532
            runAsNonRoot: true
            runAsUser: 65532
          volumeMounts:
            - name: db-con-mnt
              mountPath: /secrets/
              readOnly: true
      volumes:
        - name: db-con-mnt
          secret:
            secretName: db-service-account-credentials
      {{- end }}
Interestingly, if I kill the job with "kubectl delete job database-migration-job" after the migration is done, the helm upgrade completes and my new app version gets installed.
Answer 0 (score: 2)
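Well, I have a solution that works, though it may be a bit hacky. First of all, this is a feature Kubernetes lacks, and it is being discussed in an issue.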
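With Kubernetes v1.17, containers in the same Pod can share process namespaces. This lets us kill the proxy container from the app container. Since this is a Kubernetes Job, there should be no harm in enabling a stop handler for the app container (the lifecycle hooks Kubernetes defines are postStart and preStop; there is no postStop).
With this solution, when your app finishes and exits normally (or abnormally), the dying container runs one last command, which in this case kills another process. Keep in mind that preStop only runs when Kubernetes itself terminates a container, not when the container's process exits on its own, so for a migration that simply finishes, the kill has to be the last step of the container's own command. The job then completes successfully or fails depending on how you kill the process: the process exit code becomes the container exit code, which in turn basically becomes the job's exit code.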
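A minimal sketch of how this could look in the Job's pod template, under a few assumptions that are not in the question: the app image ships `pkill`, and the app container runs as a user allowed to signal the proxy process (root, or the same UID 65532 as the proxy). It runs the kill as the last step of the container command rather than in a lifecycle hook, and only the fields that change are shown; everything else stays as in the question's template.

```yaml
# Sketch only: merge into the pod template of the migration Job.
spec:
  template:
    spec:
      restartPolicy: Never
      # Containers in this pod share one PID namespace, so the app
      # container can see and signal the proxy process.
      shareProcessNamespace: true
      containers:
        - name: db-migrate
          # image, env and workingDir unchanged from the question
          command: ["/bin/sh", "-c"]
          args:
            - |
              python manage.py migrate --no-input
              rc=$?   # remember the migration's exit code
              # Assumption: pkill is available in the app image and the
              # proxy process is named after its binary, cloud_sql_proxy.
              pkill -TERM cloud_sql_proxy
              exit $rc   # the container reports the migration result
        - name: cloud-sql-proxy
          # image, command, securityContext and volumeMounts unchanged
```

Preserving `rc` keeps a failed migration visible as a failed container. Whether the pod as a whole counts as succeeded also depends on the exit code the proxy returns after receiving the signal, which this sketch does not verify.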