How to create a kubeconfig for a Kubernetes Airflow worker container that launches a KubernetesPodOperator

Asked: 2019-01-24 15:43:42

Tags: kubernetes google-cloud-platform airflow

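I am setting up Airflow in Kubernetes Engine and now have the following (running) pods:

  • postgres (with a mounted PersistentVolumeClaim)
  • web (the Airflow dashboard)
  • rabbitmq
  • scheduler
  • worker

From Airflow I want to run a task that starts a pod, which in this case downloads some files from an SFTP server. However, the KubernetesPodOperator in Airflow that should start this new pod cannot run, because no kubeconfig can be found.

The Airflow worker is configured as follows. The other Airflow pods are identical apart from their args.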

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: airflow
        tier: worker
    spec:
      restartPolicy: Always
      containers:
        - name: worker
          image: my-gcp-project/kubernetes-airflow-in-container-registry:v1
          imagePullPolicy: IfNotPresent
          env:
            - name: AIRFLOW_HOME
              value: "/usr/local/airflow"
          args: ["worker"]

The KubernetesPodOperator is configured as follows:

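# Import path as shown in the traceback below (Airflow contrib package).
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# `dag` is the DAG object defined elsewhere in the same file.
maybe_download = KubernetesPodOperator(
    task_id='maybe_download_from_sftp',
    image='some/image:v1',
    namespace='default',
    name='maybe-download-from-sftp',
    arguments=['sftp_download'],
    image_pull_policy='IfNotPresent',
    dag=dag,
    trigger_rule='dummy',
)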

The following error shows that there is no kubeconfig on the pod.

[2019-01-24 12:37:04,706] {models.py:1789} INFO - All retries failed; marking task as FAILED
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp Traceback (most recent call last):
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/bin/airflow", line 32, in <module>
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     args.func(args)
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     return f(*args, **kwargs)
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 490, in run
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     _run(args, dag, ti)
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 406, in _run
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     pool=args.pool,
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     return func(*args, **kwargs)
[2019-01-24 12:37:04,722] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1659, in _run_raw_task
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     result = task_copy.execute(context=context)
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 90, in execute
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     config_file=self.config_file)
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/kube_client.py", line 51, in get_kube_client
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     return _load_kube_config(in_cluster, cluster_context, config_file)
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/kube_client.py", line 38, in _load_kube_config
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     config.load_kube_config(config_file=config_file, context=cluster_context)
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/airflow/.local/lib/python3.6/site-packages/kubernetes/config/kube_config.py", line 537, in load_kube_config
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     config_persister=config_persister)
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp   File "/usr/local/airflow/.local/lib/python3.6/site-packages/kubernetes/config/kube_config.py", line 494, in _get_kube_config_loader_for_yaml_file
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp     with open(filename) as f:
[2019-01-24 12:37:04,723] {base_task_runner.py:101} INFO - Job 8: Subtask maybe_download_from_sftp FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/airflow/.kube/config'
[2019-01-24 12:37:08,300] {logging_mixin.py:95} INFO - [2019-01-24 12:37:08,299] {jobs.py:2627} INFO - Task exited with return code 1

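I want to start the pod and have it "automatically" pick up the context of the Kubernetes cluster it is running in, if that is possible. I feel like I am missing something basic. Can anyone help?

1 answer:

Answer 0 (score: 1)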

As described in The Fine Manual, you will want in_cluster=True to tell the KPO that it is, in fact, running inside the cluster.
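Applied to the operator from the question, that could look roughly like the sketch below. The import path is the one shown in the traceback; `POD_OPERATOR_KWARGS` and `make_maybe_download` are hypothetical names introduced here only for illustration, and the sketch assumes the contrib operator in the installed Airflow version accepts `in_cluster`.

```python
# Keyword arguments copied from the question, plus the one-line fix.
# POD_OPERATOR_KWARGS is a hypothetical name used only in this sketch.
POD_OPERATOR_KWARGS = {
    "task_id": "maybe_download_from_sftp",
    "image": "some/image:v1",
    "namespace": "default",
    "name": "maybe-download-from-sftp",
    "arguments": ["sftp_download"],
    "image_pull_policy": "IfNotPresent",
    "trigger_rule": "dummy",
    # Use the pod's own service account instead of looking for
    # a kubeconfig file such as /usr/local/airflow/.kube/config.
    "in_cluster": True,
}


def make_maybe_download(dag):
    """Build the operator for the given DAG (hypothetical helper)."""
    # Imported here so this module can be loaded without Airflow installed;
    # the path matches the one in the question's traceback.
    from airflow.contrib.operators.kubernetes_pod_operator import (
        KubernetesPodOperator,
    )

    return KubernetesPodOperator(dag=dag, **POD_OPERATOR_KWARGS)
```

In a DAG file this would be used as `maybe_download = make_maybe_download(dag)`. With in-cluster configuration the worker authenticates as its pod's service account, so that account presumably also needs permission to create pods in the target namespace.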

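I would actually recommend filing a bug with Airflow, because Airflow can trivially detect that it is running inside a cluster and should have a much more sensible default than the one you experienced.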
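To illustrate what such a detection could look like, here is a minimal sketch. It is not Airflow's code; it assumes the usual Kubernetes conventions that a pod gets the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables and a service account token mounted under `/var/run/secrets/kubernetes.io/serviceaccount/`. The function name `running_in_cluster` is hypothetical.

```python
import os

# Conventional mount point of the pod's service account token.
SERVICE_ACCOUNT_TOKEN = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def running_in_cluster(environ=None, token_path=SERVICE_ACCOUNT_TOKEN):
    """Guess whether this process runs inside a Kubernetes pod.

    Hypothetical helper: returns True only if the API server address is
    present in the environment and the service account token file exists.
    """
    env = os.environ if environ is None else environ
    has_api_address = bool(env.get("KUBERNETES_SERVICE_HOST")) and bool(
        env.get("KUBERNETES_SERVICE_PORT")
    )
    return has_api_address and os.path.isfile(token_path)
```

The result could then be passed along as `in_cluster=running_in_cluster()`, so the same DAG file works both on a laptop with a kubeconfig and on a worker pod without one.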