Airflow: tasks after a BranchPythonOperator get skipped

Time: 2019-01-23 16:02:39

Tags: airflow

I created a BranchPythonOperator that calls one of 2 tasks depending on a condition, as shown below:

typicon_check_table = BranchPythonOperator(
    task_id='typicon_check_table',
    python_callable=CheckTable(),
    provide_context=True,
    dag=typicon_task_dag)

typicon_create_table = PythonOperator(
    task_id='typicon_create_table',
    python_callable=CreateTable(),
    provide_context=True,
    dag=typicon_task_dag)

typicon_load_data = PythonOperator(
    task_id='typicon_load_data',
    python_callable=LoadData(),
    provide_context=True,
    dag=typicon_task_dag)

typicon_check_table.set_downstream([typicon_load_data, typicon_create_table])
typicon_create_table.set_downstream(typicon_load_data)

Here is the CheckTable callable class:

class CheckTable:
    """
    DAG task to check if table exists or not.
    """

    def __call__(self, **kwargs) -> str:
        pg_hook = PostgresHook(postgres_conn_id="postgres_docker")
        query = "SELECT EXISTS ( \
            SELECT 1 FROM information_schema.tables \
            WHERE table_schema = 'public' \
            AND table_name = 'users');"

        table_exists = pg_hook.get_records(query)[0][0]
        if table_exists:
            return "typicon_load_data"
        return "typicon_create_table"

The problem is that when the typicon_check_table task runs, both tasks are skipped.

How can I fix this?

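3 answers:

Answer 0: (score: 0)

The task typicon_load_data has typicon_create_table as a parent, and the default trigger_rule is all_success, so I am not surprised by this behavior.

There are two possible cases here:

  1. CheckTable() returns typicon_load_data. typicon_create_table is then skipped, and typicon_load_data, being downstream of it, is skipped as well.
  2. CheckTable() returns typicon_create_table, which is executed and triggers typicon_load_data, which is skipped because it is on the excluded branch.

I assume your screenshot is from case 1.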
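
To make case 1 concrete, here is a minimal pure-Python sketch of how a trigger rule turns upstream states into an outcome for typicon_load_data. It is a simplified model, not Airflow's scheduler code: the helper name evaluate_trigger_rule and the outcome strings are my own, it assumes every upstream task has already finished, and it covers only three rules (all_success, one_success, none_failed).

```python
from typing import Dict

# Task states used by this simplified model.
SUCCESS = "success"
SKIPPED = "skipped"
FAILED = "failed"


def evaluate_trigger_rule(trigger_rule: str, upstream_states: Dict[str, str]) -> str:
    """Return "run", "skipped" or "upstream_failed" for a task.

    Hypothetical helper: assumes every upstream task has already finished.
    """
    states = list(upstream_states.values())
    any_failed = FAILED in states
    any_skipped = SKIPPED in states
    any_success = SUCCESS in states

    if trigger_rule == "all_success":
        # A single skipped parent is enough to skip the task.
        if any_failed:
            return "upstream_failed"
        return "skipped" if any_skipped else "run"
    if trigger_rule == "one_success":
        # One successful parent is enough to run.
        if any_success:
            return "run"
        return "upstream_failed" if any_failed else "skipped"
    if trigger_rule == "none_failed":
        # Skipped parents are tolerated; only failures block the task.
        return "upstream_failed" if any_failed else "run"
    raise ValueError(f"rule not covered by this sketch: {trigger_rule}")


# Case 1: the branch picked typicon_load_data, so typicon_create_table is skipped.
case_1 = {
    "typicon_check_table": SUCCESS,
    "typicon_create_table": SKIPPED,
}

outcomes = {
    rule: evaluate_trigger_rule(rule, case_1)
    for rule in ("all_success", "one_success", "none_failed")
}
print(outcomes)
```

In this model the default all_success skips typicon_load_data because one of its parents was skipped, while one_success or none_failed (where your Airflow version supports it) set on typicon_load_data would let it run.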

Answer 1: (score: 0)

Add a trigger_rule="all_done" rule to typicon_check_table, as shown below:

typicon_check_table = BranchPythonOperator(
    task_id='typicon_check_table',
    python_callable=CheckTable(),
    provide_context=True,
    trigger_rule="all_done",
    dag=typicon_task_dag)

Answer 2: (score: 0)

I have worked through the same scenario, and the code below works fine for me:

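# Airflow 1.10-style import paths (assumed; adjust for your version)
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator


class DAGConditionalValidation:
    """Branch callable that picks a downstream task from a flag."""

    def __init__(self, conditional_param_key):
        self.conditional_param_key = conditional_param_key

    def __call__(self, **kwargs):
        # Return the task_id string of the branch to follow
        if self.conditional_param_key == 'Y':
            return 'slot_population_on_is_y'
        return 'slot_population_on_is_n'


# Assumes these operators are created inside a `with DAG(...)` block
slot_population_on_is_y_or_n = BranchPythonOperator(
    task_id='slot_population_on_is_y_or_n',
    python_callable=DAGConditionalValidation('Y'),
    trigger_rule='one_success')
slot_population_on_is_y = DummyOperator(task_id='slot_population_on_is_y')
slot_population_on_is_n = DummyOperator(task_id='slot_population_on_is_n')
slot_population_on_is_y_or_n >> [slot_population_on_is_y, slot_population_on_is_n]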


Your code looks fine, but you're missing the trigger rule. Set `trigger_rule='one_success'` and it should work for you as well.
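
As a quick sanity check outside Airflow, a branch callable can be invoked directly to confirm that it returns a task_id string matching one of the downstream tasks. The sketch below is only an illustration under assumptions: check_branch_callable and PickByFlag are hypothetical names, and it handles only a callable that returns a single task_id.

```python
from typing import Any, Callable, Iterable


def check_branch_callable(
    branch_callable: Callable[..., Any],
    downstream_task_ids: Iterable[str],
    **context: Any,
) -> str:
    """Call the branch callable and verify it returns a known task_id."""
    result = branch_callable(**context)
    if not isinstance(result, str):
        raise TypeError(f"expected a task_id string, got {type(result).__name__}")
    if result not in set(downstream_task_ids):
        raise ValueError(f"{result!r} is not a downstream task_id")
    return result


class PickByFlag:
    """Hypothetical stand-in for a branch callable class."""

    def __init__(self, flag: str) -> None:
        self.flag = flag

    def __call__(self, **kwargs: Any) -> str:
        # Return the task_id of the branch to follow.
        if self.flag == "Y":
            return "slot_population_on_is_y"
        return "slot_population_on_is_n"


task_ids = ["slot_population_on_is_y", "slot_population_on_is_n"]
chosen_for_y = check_branch_callable(PickByFlag("Y"), task_ids)
chosen_for_n = check_branch_callable(PickByFlag("N"), task_ids)
print(chosen_for_y, chosen_for_n)
```

A callable that returns an operator object instead of its task_id would raise the TypeError here, which is an easy mistake to catch before the DAG ever runs.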