Installing Python dependencies for Apache Airflow

Date: 2020-07-26 18:45:27

Tags: airflow airflow-scheduler airflow-operator airflow-worker

I am using Apache Airflow to run my DAGs. I want to install a Python dependency: requests==2.22.0

My docker-compose file for the webserver, scheduler and Postgres is:

version: "2.1"
services:
  postgres_airflow:
    image: postgres:12
    environment:
        - POSTGRES_USER=airflow
        - POSTGRES_PASSWORD=airflow
        - POSTGRES_DB=airflow
    ports:
        - "5432:5432"

  postgres_Service:
    image: postgres:12
    environment:
        - POSTGRES_USER=developer
        - POSTGRES_PASSWORD=secret
        - POSTGRES_DB=service_db
    ports:
        - "5433:5432"
 
  scheduler:
    image: apache/airflow
    restart: always
    depends_on:
      - postgres_airflow
      - postgres_Service
      - webserver
    env_file:
      - .env
    volumes:
        - ./dags:/opt/airflow/dags
    command: scheduler
    healthcheck:
        test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
        interval: 30s
        timeout: 30s
        retries: 3

  webserver:
    image: apache/airflow
    restart: always
    depends_on:
        - pg_airflow
        - pg_metadata
        - tenants-registry-api
        - metadata-api
    env_file:
      - .env
    volumes:
        - ./dags:/opt/airflow/dags
        - ./scripts:/opt/airflow/scripts
    ports:
        - "8080:8080"
    entrypoint: ./scripts/airflow-entrypoint.sh
    healthcheck:
        test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
        interval: 30s
        timeout: 30s
        retries: 3

My DAG file is:

import requests
from datetime import datetime

from airflow import DAG

from airflow.operators.python_operator import PythonOperator

default_args = {'owner': 'airflow',
                'start_date': datetime(2018, 1, 1)
                }

dag = DAG('download2',
          schedule_interval='0 * * * *',
          default_args=default_args,
          catchup=False)


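def hello_world_py(**context):
    # provide_context=True passes the task context as keyword arguments,
    # so the callable has to accept them.
    # Note: url is not defined anywhere in the snippet as posted.
    requests.post(url)
    print('Hello World')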


with dag:
    t1 = PythonOperator(
        task_id='download2',
        python_callable=hello_world_py,
        requirements=['requests==2.22.0'],
        provide_context=True,
        dag=dag
    )
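
To see which version of requests a task actually picks up inside the container, a small check along these lines could be called from the task callable. This is only a sketch: the helper names parse_pin, installed_version and satisfies_pin are hypothetical, and it assumes either Python 3.8+ or the importlib_metadata backport being available in the image.

```python
from typing import Optional, Tuple

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: older interpreters have the importlib_metadata backport.
    from importlib_metadata import PackageNotFoundError, version


def parse_pin(requirement: str) -> Tuple[str, str]:
    """Split a pinned requirement such as 'requests==2.22.0' into (name, version)."""
    name, _, pinned = requirement.partition("==")
    return name.strip(), pinned.strip()


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def satisfies_pin(requirement: str) -> bool:
    """True only if the package is installed at exactly the pinned version."""
    name, pinned = parse_pin(requirement)
    return installed_version(name) == pinned
```

Logging satisfies_pin('requests==2.22.0') at the start of the task would show in the task log whether the pin is met on the worker that ran it.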


The problems I'm facing are:

  1. I can't use PythonVirtualenvOperator to install the requirements because I run into a problem: Airflow log file exception

  2. I can't use something like:

    build:
      args:
        PYTHON_DEPS: "requests==2.22.0"

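because I don't have a Dockerfile in the build context. My image comes straight from apache/airflow.

  3. I can't use the volume mount ./requirements.txt:requirements.txt in initdb, because I'm not using an initdb container. I only run the airflow initdb command in my script.

Any way to solve any of the three problems above would work.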
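
Given the constraints above (no Dockerfile, no initdb container), one possible workaround is to install the pinned package with pip at runtime, before the task needs it. The following is a minimal sketch under stated assumptions, not a recommended setup: build_pip_command and ensure_requirement are hypothetical helper names, the container is assumed to have network access, and --user is used on the assumption that the container runs as a non-root user.

```python
import subprocess
import sys
from typing import List


def build_pip_command(requirement: str, user_install: bool = True) -> List[str]:
    """Build the pip command line that installs one pinned requirement."""
    # sys.executable makes pip target the interpreter that runs the task.
    command = [sys.executable, "-m", "pip", "install", "--no-cache-dir"]
    if user_install:
        # Assumption: non-root container user, so install into the user site.
        command.append("--user")
    command.append(requirement)
    return command


def ensure_requirement(requirement: str) -> None:
    """Run pip for the requirement; raises CalledProcessError if pip fails."""
    subprocess.check_call(build_pip_command(requirement))
```

With this approach, ensure_requirement('requests==2.22.0') would have to run before requests is imported, so the import would need to move from the top of the DAG file into the callable. The install also repeats in every container that runs the task, and a user site directory created during the run may not be on sys.path until the interpreter restarts.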

0 answers:

No answers