How do I set up two DAGs in Airflow using ExternalTaskSensor?

Asked: 2018-08-24 11:28:18

Tags: triggers dependencies airflow directed-acyclic-graphs

I have two DAGs:

DAG_CPS

dag = DAG(
    'DAG_CPS',
    default_args=default_args,
    dagrun_timeout=timedelta(hours=2),
    schedule_interval=None,
    max_active_runs=1
)

tmp1_cap_pes_sap = PostgresOperatorWithTemplatedParams(
    task_id='tmp1_cap_pes_sap',
    sql='./SQL/A2050.sql',
    postgres_conn_id='xxxx',
    dag=dag)
...

DAG_SAS

dag = DAG(
    'DAG_SAS',
    default_args=default_args,
    dagrun_timeout=timedelta(hours=2),
    schedule_interval=None,
    max_active_runs=1
)

wait_for_DAG_CPS = ExternalTaskSensor(
    task_id='wait_for_DAG_CPS',
    external_dag_id='DAG_CPS',
    external_task_id='tmp1_cap_pes_sap',
    execution_delta=None,
    execution_date_fn=None,
    dag=dag)

I triggered both DAGs manually from the web UI, and the task tmp1_cap_pes_sap finished:

Attribute       Value
dag_id          DAG_CPS
duration        None
end_date        2018-08-24 11:04:28.177221
execution_date  2018-08-24 11:04:18.113031

But in DAG_SAS I get the following log, and the sensor never gets past the waiting stage:

[2018-08-24 11:03:55,592] {base_task_runner.py:98} INFO - Subtask: [2018-08-24 11:03:55,592] {sensors.py:243} INFO - Poking for DAG_CPS.tmp1_cap_pes_sap on 2018-08-24T11:03:50.518595 ... 
[2018-08-24 11:04:55,642] {base_task_runner.py:98} INFO - Subtask: [2018-08-24 11:04:55,641] {sensors.py:243} INFO - Poking for DAG_CPS.tmp1_cap_pes_sap on 2018-08-24T11:03:50.518595 ... 
[2018-08-24 11:05:55,718] {base_task_runner.py:98} INFO - Subtask: [2018-08-24 11:05:55,717] {sensors.py:243} INFO - Poking for DAG_CPS.tmp1_cap_pes_sap on 2018-08-24T11:03:50.518595 ... 
[2018-08-24 11:06:55,799] {base_task_runner.py:98} INFO - Subtask: [2018-08-24 11:06:55,797] {sensors.py:243} INFO - Poking for DAG_CPS.tmp1_cap_pes_sap on 2018-08-24T11:03:50.518595 ... 
[2018-08-24 11:07:55,853] {base_task_runner.py:98} INFO - Subtask: [2018-08-24 11:07:55,853] {sensors.py:243} INFO - Poking for DAG_CPS.tmp1_cap_pes_sap on 2018-08-24T11:03:50.518595 ... 

What is wrong with my code?

SOLVED

Thanks to @Alessandro Cosentino for helping me. Here is the fixed code. Basically, if I trigger the DAGs manually, it will never work.

DAG_CPS

default_args = {
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(2),
    'retries': 2,
    'retry_delay': timedelta(minutes=1)
}

dag = DAG(
    'DAG_CPS',
    default_args=default_args,
    dagrun_timeout=timedelta(minutes=5),
    schedule_interval='*/10 * * * *',
    max_active_runs=1
)

DAG_SAS

dag = DAG(
    'DAG_SAS',
    default_args=default_args,
    dagrun_timeout=timedelta(minutes=5),
    schedule_interval='*/10 * * * *',
    max_active_runs=1
)

1 Answer:

Answer 0: (score: 1)

Because you are triggering the DAGs manually, they run with different execution_date values, which is why the ExternalTaskSensor does not detect that the first DAG's task has completed.
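To make the mismatch concrete, here is a rough pure-Python sketch of the check as I understand it. It does not use Airflow at all: the task_instances dict and the poke function are hypothetical stand-ins for Airflow's metadata table and the sensor's lookup, and only the two timestamps come from the question (the DAG_CPS task details and the "Poking for ..." log line).

```python
from datetime import datetime

# Hypothetical stand-in for Airflow's task-instance table:
# (dag_id, task_id, execution_date) -> state
task_instances = {
    ("DAG_CPS", "tmp1_cap_pes_sap",
     datetime(2018, 8, 24, 11, 4, 18, 113031)): "success",
}


def poke(external_dag_id, external_task_id, execution_date,
         allowed_states=("success",)):
    """Return True only if a task instance in an allowed state exists
    for exactly this execution_date (assumed sensor behaviour)."""
    key = (external_dag_id, external_task_id, execution_date)
    return task_instances.get(key) in allowed_states


# execution_date of the manually triggered DAG_SAS run (from the sensor log)
sensor_execution_date = datetime(2018, 8, 24, 11, 3, 50, 518595)

# The dates differ by about 28 seconds, so the lookup finds nothing.
found = poke("DAG_CPS", "tmp1_cap_pes_sap", sensor_execution_date)
print(found)  # False

# If both runs shared one execution_date, the lookup would succeed.
same_date = poke("DAG_CPS", "tmp1_cap_pes_sap",
                 datetime(2018, 8, 24, 11, 4, 18, 113031))
print(same_date)  # True
```

With manual triggers each run gets its own trigger timestamp as execution_date, so an exact match like this is very unlikely and the sensor keeps poking until it times out.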

Try running them on the same schedule and see whether that works.

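I believe this is the issue because of the execution_date_fn and execution_delta parameters, which exist precisely to synchronize two DAGs. See the docs for how these two parameters behave.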
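As a sketch of what those two parameters do, the function below shows how I understand the sensor to pick the execution_date it looks for in the other DAG. It is plain Python, not Airflow's actual code: resolve_target_date and the 10:00 / 10:30 schedules are hypothetical and only there for illustration.

```python
from datetime import datetime, timedelta


def resolve_target_date(execution_date, execution_delta=None,
                        execution_date_fn=None):
    """Return the execution_date to look for in the external DAG.

    Assumed behaviour: subtract execution_delta if given, otherwise
    apply execution_date_fn if given, otherwise use the same date.
    """
    if execution_delta is not None and execution_date_fn is not None:
        raise ValueError("pass only one of execution_delta / execution_date_fn")
    if execution_delta is not None:
        return execution_date - execution_delta
    if execution_date_fn is not None:
        return execution_date_fn(execution_date)
    return execution_date


# Hypothetical schedules: the upstream DAG runs at 10:00, the downstream at 10:30.
downstream_run = datetime(2018, 8, 24, 10, 30)

# Both parameters left as None (as in the question): an exact match is required.
default_target = resolve_target_date(downstream_run)

# execution_delta: look 30 minutes back, i.e. at the 10:00 upstream run.
delta_target = resolve_target_date(
    downstream_run, execution_delta=timedelta(minutes=30))

# execution_date_fn: compute the upstream date with an arbitrary function.
fn_target = resolve_target_date(
    downstream_run, execution_date_fn=lambda dt: dt.replace(minute=0))

print(default_target, delta_target, fn_target)
```

If the two DAGs share the same schedule, as in the fixed code in the question, neither parameter is needed, because both runs already carry the same execution_date.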