I have deployed Apache Airflow on Azure Kubernetes Service (AKS).
Apache Airflow Helm chart: https://github.com/apache/airflow/tree/master/chart
AKS version: 1.16.13
I am using git-sync to load the DAGs from GitHub; for that, I modified the values.yml file:
dags:
  persistence:
    # Enable persistent volume for storing dags
    enabled: false
    # Volume size for dags
    size: 1Gi
    # If using a custom storageClass, pass name here
    storageClassName:
    # access mode of the persistent volume
    accessMode: ReadWriteMany
    ## the name of an existing PVC to use
    existingClaim: ~
  gitSync:
    enabled: true
    # git repo clone url
    # ssh examples ssh://git@github.com/apache/airflow.git
    # git@github.com:apache/airflow.git
    # https example: https://github.com/apache/airflow.git
    repo: https://my_github_repository.git
    branch: master
    rev: HEAD
    root: "/git"
    dest: "repo"
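To confirm that git-sync actually pulled the repository, one way is to list the synced checkout inside the scheduler pod. This is a sketch: the `component=scheduler` label and the `/opt/airflow/dags/repo` mount path are assumptions based on the chart's defaults and the `root`/`dest` values above, not something from my deployment output.

```shell
# Find the scheduler pod (label assumed from the chart's defaults)
kubectl get pods -n default -l component=scheduler

# List the synced checkout; with root=/git and dest=repo, the DAG
# files should appear under the dags folder's "repo" subdirectory
kubectl exec -n default <scheduler-pod> -c scheduler -- ls /opt/airflow/dags/repo
```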
Once Airflow was deployed, I tested it with this DAG:
from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.operators.dummy_operator import DummyOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime.utcnow(),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG(
    'kubernetes_sample', default_args=default_args, schedule_interval=timedelta(minutes=10))

start = DummyOperator(task_id='run_this_first', dag=dag)

passing = KubernetesPodOperator(namespace='default',
                                image="python:3.8-slim-buster",
                                cmds=["python3", "-c"],
                                arguments=["print('hello world')"],
                                labels={"foo": "bar"},
                                name="passing-test",
                                task_id="passing-task",
                                get_logs=True,
                                dag=dag
                                )

passing.set_upstream(start)
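As a side note, the `cmds`/`arguments` pair above maps to the container's entrypoint and args, so inside the pod the task runs the equivalent of this local command (a minimal sketch, independent of Kubernetes):

```python
import subprocess

# Same command the KubernetesPodOperator runs inside the pod:
# entrypoint ["python3", "-c"] with the argument "print('hello world')"
result = subprocess.run(
    ["python3", "-c", "print('hello world')"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # hello world
```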
That works fine. Now I want to use my own image. For that I am using Azure Container Registry, and I am following this guide: https://airflow.readthedocs.io/en/latest/howto/operator/kubernetes.html. I create the secret to access my Azure registry with the following command:
kubectl create secret docker-registry testquay \
--docker-server=quay.io \
--docker-username=<Profile name> \
--docker-password=<password>
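As a side note, for an Azure Container Registry the `--docker-server` would normally be the registry's `.azurecr.io` login server rather than `quay.io`, and the secret name has to match what `image_pull_secrets` references in the DAG. A hedged sketch, where the registry name and the service-principal credentials are placeholders:

```shell
# Create the pull secret against the ACR login server; the secret name
# ("azure-registry" here) must match image_pull_secrets in the DAG
kubectl create secret docker-registry azure-registry \
  --docker-server=<registry-name>.azurecr.io \
  --docker-username=<service-principal-id> \
  --docker-password=<service-principal-password> \
  --namespace=default
```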
I built the image, tested it locally, and it works. I pushed the image to Azure Container Registry and wrote the following DAG:
from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.operators.dummy_operator import DummyOperator
from kubernetes.client import models as k8s
import logging
import os
import sys
import traceback

try:
    default_args = {
        'owner': 'airflow',
        'depends_on_past': False,
        'start_date': datetime.utcnow(),
        'email': ['airflow@example.com'],
        'email_on_failure': False,
        'email_on_retry': False,
        'retries': 1,
        'retry_delay': timedelta(minutes=5)
    }

    dag = DAG(
        'test1', default_args=default_args, schedule_interval=timedelta(minutes=10))

    start = DummyOperator(task_id='run_this_first', dag=dag)

    quay_k8s = KubernetesPodOperator(
        namespace='default',
        image='<MY_SERVER_NAME>/testingairlfowdags:latest',
        image_pull_secrets=[k8s.V1LocalObjectReference('azure-registry')],
        name="testingairlfowdags",
        is_delete_operator_pod=True,
        in_cluster=True,
        task_id="task-two",
        get_logs=True,
        log_events_on_failure=True,
        dag=dag
    )

    start >> quay_k8s
except Exception as e:
    error_message = {
        "message": "An internal error occurred",
        "error": str(e),
        "error information": str(sys.exc_info()),
        "traceback": str(traceback.format_exc())
    }
    logging.info(error_message)
But when the quay_k8s task starts, a pod is created and then suddenly killed, and I cannot get any logs.
While the pod is initializing, the Kubernetes dashboard shows me the following:
But then:
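To try to debug this, one option is to temporarily set `is_delete_operator_pod=False` in the DAG so the failed pod is kept around, and then inspect its events. This is a sketch of generic kubectl usage, with the pod name as a placeholder:

```shell
# With is_delete_operator_pod=False the failed pod survives,
# so its events and container status can be inspected:
kubectl get pods -n default

# The Events section at the bottom of describe typically shows
# ErrImagePull / ImagePullBackOff if the registry or secret is wrong
kubectl describe pod <failed-pod-name> -n default

kubectl logs <failed-pod-name> -n default
```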