Resuming state in a restarted Airflow TaskInstance

Asked: 2017-10-04 15:52:57

Tags: airflow, apache-airflow

I have an Airflow operator that starts a job on a third-party service and then monitors that job's progress. In code, the execution looks like:

def execute(self, context):
    external_id = start_external_job()
    wait_until_external_job_completes(external_id)

If the Airflow worker running an instance of this task is restarted (usually because of a code deploy), I would like the restarted task instance to pick up where the previous one left off (monitoring the job on the third-party service) rather than starting a new job. Is there a way to share the third-party job ID across subsequent runs of the same task instance?

An enhanced execute method would look something like this:

def execute(self, context):
    external_id = load_external_id_for_task_instance()
    if external_id is None:
        external_id = start_external_job(args)
        persist_external_id_for_task_instance(external_id)

    wait_until_external_job_completes(external_id)

I need to implement load_external_id_for_task_instance and persist_external_id_for_task_instance.
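For reference, one way these two helpers could be implemented (a sketch only; the key format and the fact that they take the task context are my assumptions, not part of the question) is to store the ID in an Airflow Variable, which lives in the metadata database and therefore survives worker restarts:

from airflow.models import Variable


def _state_key(context):
    # Build a key that is unique per task instance (assumed key format);
    # execution_date is stable across retries, so the key is too.
    ti = context['task_instance']
    return 'external_id__{}__{}__{}'.format(ti.dag_id, ti.task_id, ti.execution_date.isoformat())


def load_external_id_for_task_instance(context):
    # Returns None if nothing has been persisted yet
    return Variable.get(_state_key(context), default_var=None)


def persist_external_id_for_task_instance(context, external_id):
    Variable.set(_state_key(context), external_id)

Note that Variables are never cleaned up automatically, so this approach leaves one row per task instance in the metadata database. The answer below avoids that bookkeeping by splitting the work into two tasks.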

1 Answer:

Answer 0 (score: 2)

I would suggest splitting this into two tasks, using XComs and Sensors.

You can have one operator that submits the job and saves the ID to XCom:

from airflow.models import BaseOperator


class SubmitJobOperator(BaseOperator):

    def execute(self, context):
        external_id = start_external_job()
        return external_id  # return value will be stored in XCom
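As an aside, the automatic XCom push of the return value is equivalent to pushing it explicitly inside execute. A variant using an explicit key might look like the sketch below (the class name and the key 'external_id' are my own placeholders); the sensor would then have to pass key='external_id' to xcom_pull, since xcom_pull defaults to the 'return_value' key:

from airflow.models import BaseOperator


class SubmitJobWithExplicitPushOperator(BaseOperator):
    # Variant of the operator above that pushes under an explicit key
    # ('external_id' is an assumption, not an Airflow default).

    def execute(self, context):
        external_id = start_external_job()
        context['task_instance'].xcom_push(key='external_id', value=external_id)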

And then a sensor which fetches the id from XCom and polls until the job is complete:

from airflow.operators.sensors import BaseSensorOperator
from airflow.utils.decorators import apply_defaults


class JobCompleteSensor(BaseSensorOperator):

    @apply_defaults
    def __init__(self, submit_task_id, *args, **kwargs):
        self.submit_task_id = submit_task_id  # so we know where to fetch the XCom value from
        super(JobCompleteSensor, self).__init__(*args, **kwargs)

    def poke(self, context):
        external_id = context['task_instance'].xcom_pull(task_ids=self.submit_task_id)
        return check_if_external_job_is_complete(external_id)
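check_if_external_job_is_complete is a placeholder for whatever status check your third-party service offers. Purely as an illustration (the URL, endpoint and response shape below are invented), it might poll a REST status endpoint:

import requests


def check_if_external_job_is_complete(external_id):
    # Hypothetical status endpoint; replace with your service's real API
    response = requests.get('https://example.com/api/jobs/{}'.format(external_id))
    response.raise_for_status()
    return response.json().get('status') == 'COMPLETE'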

So your DAG would look something like this:

submit_job = SubmitJobOperator(
    dag=dag,
    task_id='submit_job',
)

wait_for_job_to_complete = JobCompleteSensor(
    dag=dag,
    task_id='wait_for_job_to_complete',
    submit_task_id=submit_job.task_id,
)

submit_job >> wait_for_job_to_complete
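The snippet above assumes a dag object already exists; a minimal definition (the dag_id, start date and schedule are placeholders) could be:

from datetime import datetime

from airflow import DAG

dag = DAG(
    dag_id='external_job_dag',       # placeholder name
    start_date=datetime(2017, 10, 1),
    schedule_interval=None,          # trigger manually, or set a cron expression
)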

XComs are persisted in the database, so the sensor will always be able to find the external_id of the previously submitted job, even if the worker is restarted while the sensor is waiting.