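I'm using Apache Airflow 1.10.12 and have a DAG that runs in a hybrid fashion: it is scheduled daily and is also triggered when certain external conditions are met. Catchup is set to False.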
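I have a ShortCircuitOperator that pulls the previous run's task instance and its XCom result. For the sake of the example, that task is a BashOperator named get_date1 that runs the date command. The pull is done as follows: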
    @provide_session
    def validate_processing_need(**context):
        ti = context["ti"]
        previous_ti = context['ti']._get_previous_ti()
        old_result = previous_ti.xcom_pull(
            task_ids="get_date1", key="return_value"
        )
However, this does not work well when scheduled and manually triggered runs are interleaved. Example:
1. scheduled run
INFO - This run XCom result: Thu Apr 29 10:40:29 UTC 2021
2. triggered run
INFO - This run XCom result: Thu Apr 29 10:42:45 UTC 2021
INFO - Previous facts run XCom: Thu Apr 29 10:40:29 UTC 2021 <- previous run, correct
3. second triggered run
INFO - This run XCom result: Thu Apr 29 10:43:29 UTC 2021
INFO - Previous facts run XCom: Thu Apr 29 10:42:45 UTC 2021 <- previous run, correct
4. second scheduled run
INFO - This run XCom result: Thu Apr 29 10:45:44 UTC 2021
INFO - Previous facts run XCom: Thu Apr 29 10:40:29 UTC 2021 <- value of the 1st scheduled run, not last one
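I tried overriding this function in the TaskInstance class, but the task instance in the context always points to the original class. I also tried
    context['ti']._get_previous_ti(state=State.SUCCESS)
but the result was almost the same. How can I make this function always return the correct previous run, regardless of whether it was scheduled or triggered?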
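
One possible explanation for the log above, sketched in plain Python with no Airflow calls. It assumes the previous-run lookup picks the run with the greatest execution_date below the current one, that a scheduled run's execution_date is the start of its interval, and that a triggered run's execution_date is its trigger time. The Run class, the helper names, the 5-minute interval and the execution_date values are hypothetical; only the XCom strings come from the logs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Run:
    """Hypothetical stand-in for a DAG run; this is not an Airflow class."""
    run_id: str
    execution_date: datetime  # logical date (assumed values)
    start_date: datetime      # wall-clock start, taken from the logs
    xcom: str                 # value returned by get_date1


def previous_by_execution_date(runs: List[Run], current: Run) -> Optional[Run]:
    # Assumed behaviour of the built-in lookup: latest run whose
    # execution_date is strictly before the current run's execution_date.
    earlier = [r for r in runs if r.execution_date < current.execution_date]
    return max(earlier, key=lambda r: r.execution_date) if earlier else None


def previous_by_start_date(runs: List[Run], current: Run) -> Optional[Run]:
    # Alternative: latest run that actually started before the current one,
    # regardless of whether it was scheduled or triggered.
    earlier = [r for r in runs if r.start_date < current.start_date]
    return max(earlier, key=lambda r: r.start_date) if earlier else None


def at(hms: str) -> datetime:
    # Helper to build timestamps on the day shown in the logs.
    return datetime.strptime("2021-04-29 " + hms, "%Y-%m-%d %H:%M:%S")


runs = [
    Run("scheduled_1", at("10:35:00"), at("10:40:29"), "Thu Apr 29 10:40:29 UTC 2021"),
    Run("triggered_1", at("10:42:45"), at("10:42:45"), "Thu Apr 29 10:42:45 UTC 2021"),
    Run("triggered_2", at("10:43:29"), at("10:43:29"), "Thu Apr 29 10:43:29 UTC 2021"),
    Run("scheduled_2", at("10:40:00"), at("10:45:44"), "Thu Apr 29 10:45:44 UTC 2021"),
]

current = runs[-1]  # the second scheduled run
print(previous_by_execution_date(runs, current).xcom)  # Thu Apr 29 10:40:29 UTC 2021
print(previous_by_start_date(runs, current).xcom)      # Thu Apr 29 10:43:29 UTC 2021
```

Under these assumptions, ordering by execution_date reproduces the fourth log entry, while ordering by start time returns the second triggered run. If that is indeed the cause, the fix would presumably be a lookup over the DAG's runs ordered by start time rather than by execution_date.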