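I am trying to understand how to create dynamic DAGs in Apache Airflow, because I need to generate DAGs dynamically in a project.
Here is the link I am referring to: Dynamic DAG creation in Apache airflow
Below is the code block I use to create sample hello-world dynamic DAGs (the DAGs are generated from input parameters).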
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def create_dag(dag_id,
               schedule,
               dag_number,
               default_args):

    def hello_world_py(*args):
        print('Hello World')
        print('This is DAG: {}'.format(str(dag_number)))

    dag = DAG(dag_id,
              schedule_interval=schedule,
              default_args=default_args)

    with dag:
        t1 = PythonOperator(
            task_id='hello_world',
            python_callable=hello_world_py,
            dag_number=dag_number)

    return dag

# build a dag for each number in range(10)
for n in range(1, 10):
    dag_id = 'hello_world_{}'.format(str(n))
    default_args = {'owner': 'airflow',
                    'start_date': datetime(2018, 1, 1)
                    }
    schedule = '@daily'
    dag_number = n
    globals()[dag_id] = create_dag(dag_id,
                                  schedule,
                                  dag_number,
                                  default_args)
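I expect this to create 9 DAGs. What I see is that once I compile the code block above with python3 code_sample.py, it does create 9 DAGs, but the code embedded in each DAG is the complete sample code.
As I understand it, however, each created DAG should contain only the following code block, which is the part inside the create_dag method of the sample code above.
Expected DAG code: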
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def hello_world_py(*args):
    print('Hello World')
    print('This is DAG: {}'.format(str(dag_number)))

dag = DAG(dag_id,
          schedule_interval=schedule,
          default_args=default_args)

with dag:
    t1 = PythonOperator(
        task_id='hello_world',
        python_callable=hello_world_py,
        dag_number=dag_number)
Actual DAG code:
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def create_dag(dag_id,
               schedule,
               dag_number,
               default_args):

    def hello_world_py(*args):
        print('Hello World')
        print('This is DAG: {}'.format(str(dag_number)))

    dag = DAG(dag_id,
              schedule_interval=schedule,
              default_args=default_args)

    with dag:
        t1 = PythonOperator(
            task_id='hello_world',
            python_callable=hello_world_py,
            dag_number=dag_number)

    return dag

# build a dag for each number in range(10)
for n in range(1, 10):
    dag_id = 'hello_world_{}'.format(str(n))
    default_args = {'owner': 'airflow',
                    'start_date': datetime(2018, 1, 1)
                    }
    schedule = '@daily'
    dag_number = n
    globals()[dag_id] = create_dag(dag_id,
                                  schedule,
                                  dag_number,
                                  default_args)
Please let me know what is causing this.
Answer 0 (score: 1)
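The code you see in the Airflow UI when you click the "Code" tab is the source of the entire .py file, not the code of an individual DAG. See how this feature is implemented: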
https://github.com/apache/airflow/blob/master/airflow/www/views.py#L437
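To illustrate why every generated DAG shows the same code, here is a minimal sketch in plain Python with no Airflow dependency. It assumes the Code view simply reads the file recorded on the DAG object (Airflow DAGs carry a fileloc attribute for this). FakeDag, render_code_tab, and the sample file contents are hypothetical stand-ins for illustration, not Airflow's actual implementation.

```python
import os
import tempfile

# Hypothetical contents of a DAG file that generates several DAGs.
DAG_FILE_SOURCE = '''\
def create_dag(dag_id):
    return dag_id

for n in range(1, 4):
    globals()["hello_world_%d" % n] = create_dag("hello_world_%d" % n)
'''


class FakeDag:
    """Hypothetical stand-in for a DAG: keeps only its id and defining file."""

    def __init__(self, dag_id, fileloc):
        self.dag_id = dag_id
        self.fileloc = fileloc


def render_code_tab(dag):
    """Assumed behaviour of a 'Code' view: return the whole defining file."""
    with open(dag.fileloc) as f:
        return f.read()


def demo():
    """Create three DAG stand-ins from one file and render each one's code."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "code_sample.py")
        with open(path, "w") as f:
            f.write(DAG_FILE_SOURCE)
        # All three objects come from the same file, so they share fileloc.
        dags = [FakeDag("hello_world_%d" % n, path) for n in range(1, 4)]
        return [render_code_tab(d) for d in dags]


views = demo()
# Every DAG renders the identical, complete file.
print(len(set(views)) == 1)
```

Because the view is keyed on the file and not on the DAG, all DAGs generated by one file display identical source. Each DAG still behaves differently at run time, since hello_world_py closes over the dag_number passed to create_dag.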