I created a `BranchPythonOperator` that runs one of two tasks depending on a condition, as follows:

    typicon_check_table = BranchPythonOperator(
        task_id='typicon_check_table',
        python_callable=CheckTable(),
        provide_context=True,
        dag=typicon_task_dag)

    typicon_create_table = PythonOperator(
        task_id='typicon_create_table',
        python_callable=CreateTable(),
        provide_context=True,
        dag=typicon_task_dag)

    typicon_load_data = PythonOperator(
        task_id='typicon_load_data',
        python_callable=LoadData(),
        provide_context=True,
        dag=typicon_task_dag)

    typicon_check_table.set_downstream([typicon_load_data, typicon_create_table])
    typicon_create_table.set_downstream(typicon_load_data)

Here is the `CheckTable` callable class:

    class CheckTable:
        """
        DAG task to check if table exists or not.
        """

        def __call__(self, **kwargs) -> str:
            pg_hook = PostgresHook(postgres_conn_id="postgres_docker")
            query = "SELECT EXISTS ( \
                SELECT 1 FROM information_schema.tables \
                WHERE table_schema = 'public' \
                AND table_name = 'users');"
            # EXISTS yields one row with one boolean column.
            table_exists = pg_hook.get_records(query)[0][0]
            # A branch callable returns the task_id of the branch to follow.
            if table_exists:
                return "typicon_load_data"
            return "typicon_create_table"

The problem is that when the `typicon_check_table` task runs, both downstream tasks are skipped.

How can I fix this?
Answer 0 (score: 0)
The task `typicon_load_data` has `typicon_create_table` as a parent, and the default `trigger_rule` is `all_success`, so I am not surprised by this behaviour.
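There are two possible cases here:

1. `CheckTable()` returns `typicon_load_data`, so `typicon_create_table` is skipped; but `typicon_load_data`, being downstream of it, is skipped as well.
2. `CheckTable()` returns `typicon_create_table`, which is executed and then triggers `typicon_load_data`, which is skipped because it was the excluded branch.

I assume your screenshot is from case 1.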
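
Case 1 can be illustrated with a small pure-Python model of trigger-rule evaluation. This is a simplified sketch, not Airflow's actual scheduler code: the `evaluate` function and the state strings are hypothetical names introduced for illustration, and only two rules are modelled. It assumes `typicon_check_table` succeeded and `typicon_create_table` was skipped by the branch.

```python
SUCCESS, SKIPPED, FAILED = "success", "skipped", "failed"


def evaluate(trigger_rule, upstream_states):
    """Simplified model: what a task does once all of its parents have finished."""
    states = list(upstream_states)
    if trigger_rule == "all_success":
        if FAILED in states:
            return "upstream_failed"
        # A single skipped parent propagates the skip to this task.
        return SKIPPED if SKIPPED in states else "run"
    if trigger_rule == "none_failed":
        # Skipped parents are tolerated; only a failed parent blocks the task.
        return "upstream_failed" if FAILED in states else "run"
    raise ValueError("rule not modelled in this sketch: " + trigger_rule)


# Case 1: the branch chose typicon_load_data, so typicon_create_table was skipped.
parents_of_load_data = {
    "typicon_check_table": SUCCESS,
    "typicon_create_table": SKIPPED,
}

print(evaluate("all_success", parents_of_load_data.values()))  # skipped
print(evaluate("none_failed", parents_of_load_data.values()))  # run
```

Under this model, a rule that tolerates skipped parents, such as `trigger_rule="none_failed"` in Airflow versions that support it, would need to be set on `typicon_load_data` itself for it to run in case 1. That follows from the reasoning above and has not been tested against the original DAG.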
Answer 1 (score: 0)
Add a `trigger_rule="all_done"` rule to `typicon_check_table`, as shown below:

    typicon_check_table = BranchPythonOperator(
        task_id='typicon_check_table',
        python_callable=CheckTable(),
        provide_context=True,
        trigger_rule="all_done",
        dag=typicon_task_dag)

Answer 2 (score: 0)
I have worked through the same scenario, and the code below works fine for me:

    class DAGConditionalValidation:
        def __init__(self, conditional_param_key):
            self.conditional_param_key = conditional_param_key

        def __call__(self, **kwargs):
            # Return the task_id (a string) of the branch to follow.
            if self.conditional_param_key == 'Y':
                return 'slot_population_on_is_y'
            return 'slot_population_on_is_n'

    slot_population_on_is_y_or_n = BranchPythonOperator(
        task_id='slot_population_on_is_y_or_n',
        python_callable=DAGConditionalValidation('Y'),
        trigger_rule='one_success')
    slot_population_on_is_y = DummyOperator(task_id='slot_population_on_is_y')
    slot_population_on_is_n = DummyOperator(task_id='slot_population_on_is_n')

    slot_population_on_is_y_or_n >> [slot_population_on_is_y, slot_population_on_is_n]

Your code looks fine, but you are missing the trigger rule. Please set `trigger_rule='one_success'`; this should work for you as well.