How do I get the TaskInstance from the previous DAG run, regardless of whether the run was scheduled or triggered?

Asked: 2021-04-29 11:10:29

Tags: python airflow google-cloud-composer apache-airflow-xcom

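I am using Apache Airflow 1.10.12 and have a DAG that runs in a mixed fashion: it runs daily on a schedule and is also triggered when certain external conditions are met. catchup is set to False.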

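I have a ShortCircuitOperator that pulls the task instance of the previous run together with its XCom result. For this example, that result is the output of a BashOperator named get_date1 that runs the date command. The pull looks like this: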

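from airflow.utils.db import provide_session  # Airflow 1.10.x import path

@provide_session
def validate_processing_need(session=None, **context):
    # session is injected by @provide_session
    ti = context["ti"]
    # _get_previous_ti() is a private TaskInstance method in Airflow 1.10.x
    previous_ti = ti._get_previous_ti()
    # XCom returned by the get_date1 task in the previous run
    old_result = previous_ti.xcom_pull(
        task_ids="get_date1", key="return_value"
    )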

However, this does not work well when the DAG is started in both ways (scheduled and manual). Example:

1. scheduled run
 INFO - This run XCom result: Thu Apr 29 10:40:29 UTC 2021

2. triggered run
 INFO - This run XCom result: Thu Apr 29 10:42:45 UTC 2021
 INFO - Previous facts run XCom:  Thu Apr 29 10:40:29 UTC 2021 <- previous run, correct

3. second triggered run
 INFO - This run XCom result: Thu Apr 29 10:43:29 UTC 2021
 INFO - Previous facts run XCom:  Thu Apr 29 10:42:45 UTC 2021 <- previous run, correct

4. second scheduled run
 INFO - This run XCom result: Thu Apr 29 10:45:44 UTC 2021
 INFO - Previous facts run XCom:  Thu Apr 29 10:40:29 UTC 2021 <- value of the 1st scheduled run, not last one

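I tried overriding this function in the TaskInstance class, but the ti taken from the context always points to the original class. I also tried context['ti']._get_previous_ti(state=State.SUCCESS), but the result was almost the same. How can I make this function always return the correct most recent run, regardless of whether it was scheduled or triggered?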
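
A possible explanation for the pattern in the logs above, sketched in plain Python rather than against the Airflow API: in Airflow 1.10 the previous-run lookup compares execution_date, and a scheduled run's execution_date is the start of its schedule interval, while a manually triggered run's execution_date is the trigger time. The Run class, the previous_by helper, and the 5-minute interval below are hypothetical and chosen only to mirror the timestamps in the example. They are not part of Airflow.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Run:
    """Hypothetical stand-in for a DAG run; not an Airflow class."""
    label: str
    execution_date: datetime  # logical date (interval start for scheduled runs)
    start_date: datetime      # wall-clock time at which the run actually started


def previous_by(runs: List[Run], current: Run, key: str) -> Optional[Run]:
    """Return the run whose `key` is the largest value strictly below current's."""
    earlier = [r for r in runs if getattr(r, key) < getattr(current, key)]
    return max(earlier, key=lambda r: getattr(r, key), default=None)


def t(hms: str) -> datetime:
    """Build a timestamp on the day used in the example logs."""
    return datetime.strptime("2021-04-29 " + hms, "%Y-%m-%d %H:%M:%S")


# Assumption: a 5-minute schedule. Scheduled runs get the interval start as
# execution_date; triggered runs get the trigger time.
runs = [
    Run("scheduled-1", t("10:35:00"), t("10:40:29")),
    Run("triggered-1", t("10:42:45"), t("10:42:45")),
    Run("triggered-2", t("10:43:29"), t("10:43:29")),
    Run("scheduled-2", t("10:40:00"), t("10:45:44")),
]
current = runs[-1]  # the second scheduled run

by_logical = previous_by(runs, current, "execution_date")
by_wall_clock = previous_by(runs, current, "start_date")

print(by_logical.label)     # scheduled-1
print(by_wall_clock.label)  # triggered-2
```

If this is indeed the cause, selecting the previous run by start_date instead of execution_date (for example by querying DagRun directly) should return the most recent run regardless of how it was started. That is an untested suggestion for 1.10.12, not a confirmed fix.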

0 answers:

No answers yet