Airflow starts again from the first task

Date: 2020-06-02 09:02:35

Tags: airflow

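I am building a DAG that uses branching to run backup, restore, and delete tasks on three schedules:

  1. Daily backup, restore, and delete (always runs)
  2. Weekly backup, restore, and delete, if the run falls on a Saturday
  3. Monthly backup, restore, and delete, if the run falls on the first day of the month

Here is the flow: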

    daily_backup_op.set_downstream(daily_restore_op)
    daily_restore_op.set_downstream(daily_delete_op)

    daily_delete_op.set_downstream(branching_op)
    branching_op.set_downstream([weekly_backup_op,monthly_backup_op,monthly_weekly_backup_op,end_task_op])

    weekly_backup_op.set_downstream(weekly_restore_op)
    weekly_restore_op.set_downstream(weekly_delete_op)

    monthly_backup_op.set_downstream(monthly_restore_op)
    monthly_restore_op.set_downstream(monthly_delete_op)

    monthly_weekly_backup_op.set_downstream(monthly_weekly_restore_op)
    monthly_weekly_restore_op.set_downstream(monthly_weekly_delete_op)

    weekly_delete_op.set_downstream(end_task_op)
    monthly_delete_op.set_downstream(end_task_op)
    monthly_weekly_delete_op.set_downstream(end_task_op)

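When I test it and the branch falls through to "else: return 'end'", for some reason the run never reaches end_task_op; instead it starts over from daily_backup_op.

Here is the branching code: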

def backup_restore_condition():
    if((DATE_TODAY == get_first_day_of_month()) and (get_weekday_number() == SATURDAY_WEEKDAY_NUMBER)):
        return 'monthly_weekly_backup'
    elif(DATE_TODAY == get_first_day_of_month()):
        return 'monthly_backup'
    elif(get_weekday_number() == SATURDAY_WEEKDAY_NUMBER):
        return 'weekly_backup'
    else:
        return 'end'

For end_task_op:

def end_task():
    print("task end successfully")

end_task_op = PythonOperator(
        task_id='end',
        python_callable=end_task
    )

Image of the full flow: [DAG image]

Am I missing something?

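1 Answer:

Answer 0 (score: 1)

In end_task_op you must set trigger_rule to NONE_FAILED, which means "run the task when all parent tasks have succeeded or been skipped". With the default rule, all_success, end_task_op is itself skipped whenever one of its parents is skipped, and the branches that were not chosen are always skipped. As Javier suggested, it is also better to use bitshift composition (>>), since it improves readability and reduces the number of lines. The DAG below works in all cases.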
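
To see why the trigger rule matters, here is a simplified model of the two rules in plain Python. It is only a sketch for illustration, not Airflow's implementation: State and should_run are hypothetical names, and only the rule strings "all_success" and "none_failed" correspond to Airflow's trigger rule values.

```python
from enum import Enum


class State(Enum):
    SUCCESS = "success"
    SKIPPED = "skipped"
    FAILED = "failed"
    UPSTREAM_FAILED = "upstream_failed"


def should_run(trigger_rule, upstream_states):
    """Simplified model: decide whether a task runs given its parents' states."""
    failed_states = {State.FAILED, State.UPSTREAM_FAILED}
    if trigger_rule == "all_success":
        # Every parent must have succeeded; a skipped parent blocks the task.
        return all(s is State.SUCCESS for s in upstream_states)
    if trigger_rule == "none_failed":
        # Parents may be successful or skipped, as long as none failed.
        return not any(s in failed_states for s in upstream_states)
    raise ValueError("rule not covered by this sketch: %s" % trigger_rule)


# A plain weekday: the branch task succeeded and chose the end task,
# so the three *_delete parents of the end task were skipped.
parents = [State.SUCCESS, State.SKIPPED, State.SKIPPED, State.SKIPPED]

print(should_run("all_success", parents))  # False
print(should_run("none_failed", parents))  # True
```

With the default rule the end task cannot run on such a day, which matches the behaviour described in the question.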

from airflow import DAG
# Airflow 1.10 import path; Airflow 2 moved these operators to airflow.operators.python.
from airflow.operators.python_operator import PythonOperator, BranchPythonOperator
from airflow.utils.trigger_rule import TriggerRule
import datetime as dt

args = {
    'owner': 'airflow',
    'start_date': dt.datetime(2020, 6, 2)
}

dag = DAG(
    'backup_restore_delete_db',
    schedule_interval="@daily",
    default_args=args
)

def daily_backup():
    print('daily backup')

def daily_restore():
    print('restore daily backup')

def daily_delete():
    print('delete 7 days old backup')

def weekly_backup():
    print('weekly backup')

def weekly_restore():
    print('restore weekly backup')

def weekly_delete():
    print('delete 7 week old backup')

def monthly_backup():
    print('monthly backup')

def monthly_restore():
    print('restore monthly backup')

def monthly_delete():
    print('delete 7 month old backup')

def monthly_weekly_backup():
    print('monthlyweekly backup')

def monthly_weekly_restore():
    print('restore monthlyweekly backup')

def monthly_weekly_delete():
    print('delete 7 monthlyweekly old backup')

def end_task():
    print("task end successfully")

def branch_func():
    # Return the task_id that the BranchPythonOperator should follow;
    # the branches that are not returned are skipped.
    today = dt.date.today()
    first_of_month = today.day == 1
    saturday = today.isoweekday() == 6  # isoweekday(): Monday is 1, Saturday is 6
    if first_of_month and saturday:
        return 'monthly_weekly_backup'
    elif first_of_month:
        return 'monthly_backup'
    elif saturday:
        return 'weekly_backup'
    else:
        return 'end_task'

daily_backup_op = PythonOperator(
        task_id='daily_backup',
        python_callable=daily_backup,
        dag=dag
    )
weekly_backup_op = PythonOperator(
        task_id='weekly_backup',
        python_callable=weekly_backup,
        dag=dag
    )

monthly_backup_op = PythonOperator(
        task_id='monthly_backup',
        python_callable=monthly_backup,
        dag=dag
    )
daily_restore_op = PythonOperator(
        task_id='daily_restore',
        python_callable=daily_restore,
        dag=dag
    )

weekly_restore_op = PythonOperator(
        task_id='weekly_restore',
        python_callable=weekly_restore,
        dag=dag
    )

monthly_restore_op = PythonOperator(
        task_id='monthly_restore',
        python_callable=monthly_restore,
        dag=dag
    )

daily_delete_op = PythonOperator(
        task_id='daily_delete',
        python_callable=daily_delete,
        dag=dag
    )

weekly_delete_op = PythonOperator(
        task_id='weekly_delete',
        python_callable=weekly_delete,
        dag=dag
    )

monthly_delete_op = PythonOperator(
        task_id='monthly_delete',
        python_callable=monthly_delete,
        dag=dag
    )

monthly_weekly_backup_op = PythonOperator(
        task_id='monthly_weekly_backup',
        python_callable=monthly_weekly_backup,
        dag=dag
    )

monthly_weekly_restore_op = PythonOperator(
        task_id='monthly_weekly_restore',
        python_callable=monthly_weekly_restore,
        dag=dag
    )

monthly_weekly_delete_op = PythonOperator(
        task_id='monthly_weekly_delete',
        python_callable=monthly_weekly_delete,
        dag=dag
    )

end_task_op = PythonOperator(
        task_id='end_task',
        python_callable=end_task,
        # Run as long as no parent failed, so skipped branches do not skip this task.
        trigger_rule=TriggerRule.NONE_FAILED,
        dag=dag
    )

branching_op = BranchPythonOperator(
        task_id='branch_func',
        python_callable=branch_func,
        dag=dag
    )

daily_backup_op >> daily_restore_op >> daily_delete_op >> branching_op
branching_op >> monthly_weekly_backup_op >> monthly_weekly_restore_op >> monthly_weekly_delete_op >> end_task_op
branching_op >> monthly_backup_op >> monthly_restore_op >> monthly_delete_op >> end_task_op
branching_op >> weekly_backup_op >> weekly_restore_op >> weekly_delete_op >> end_task_op
branching_op >> end_task_op
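
One caveat with branch_func as written: it reads dt.date.today(), so a backfilled or delayed run branches on the wall-clock date rather than on the date the run represents. Below is a minimal sketch of a variant that takes the date as an argument. choose_branch is a hypothetical helper name, not part of Airflow, and the task ids are the ones used in the DAG above. In Airflow 1.10 it could presumably be called from the branch callable with the run's execution_date (available in the context when provide_context=True is set), but that wiring is an assumption and is not shown here.

```python
import datetime as dt

SATURDAY = 6  # isoweekday(): Monday is 1, Sunday is 7


def choose_branch(run_date):
    """Return the task_id to follow for the given date (hypothetical helper)."""
    first_of_month = run_date.day == 1
    saturday = run_date.isoweekday() == SATURDAY
    if first_of_month and saturday:
        return 'monthly_weekly_backup'
    if first_of_month:
        return 'monthly_backup'
    if saturday:
        return 'weekly_backup'
    return 'end_task'


print(choose_branch(dt.date(2020, 8, 1)))  # monthly_weekly_backup (a Saturday)
print(choose_branch(dt.date(2020, 6, 1)))  # monthly_backup (a Monday)
print(choose_branch(dt.date(2020, 6, 6)))  # weekly_backup (a Saturday)
print(choose_branch(dt.date(2020, 6, 2)))  # end_task (a Tuesday)
```

Keeping the decision in a function of the date also makes the branching logic easy to check without starting a scheduler.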