Airflow execution date issue when running a DAG multiple times a day

Time: 2020-07-23 12:59:11

Tags: python airflow

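I'm trying to use the schedule_interval parameter in the DAG arguments to make a DAG run four times a day. As expected, the DAG started executing 4 times a day. However, I noticed that something is off with the execution date. For example: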

The original setup is:

from datetime import datetime

from airflow import DAG
from airflow.models import Variable

default_args = {
    "task_id": PARENT_DAG_NAME,
    "owner": Variable.get("OWNER"),
    "depends_on_past": False,
    "catchup": False,
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 0,
    "concurrency": 1,
    "start_date":datetime(2019, 9, 6)
}

main_dag = DAG(
    dag_id=PARENT_DAG_NAME,
    schedule_interval="05 11 * * *",
    start_date=datetime(2019, 9, 6),
    default_args=default_args,
)

When it executes, I get:

Clock Time: 2020-07-20 11:05:00

Run: 2020-07-19T11:05:00+00:00
run_id: scheduled__2020-07-19T11:05:00+00:00
Started: 2020-07-20T11:05:00+00:00

Notice that the difference between Run / run_id and Started is about one day, with Started being one day ahead.
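
The one-day gap is consistent with Airflow's scheduling convention, in which a scheduled run is stamped with the start of the interval it covers and is triggered only once that interval has ended. A minimal sketch of that arithmetic with plain `datetime` follows; the helper name `expected_start` is hypothetical and is not part of the Airflow API:

```python
from datetime import datetime, timedelta, timezone


def expected_start(execution_date: datetime, interval: timedelta) -> datetime:
    """Earliest time the run stamped `execution_date` can start.

    Hypothetical helper for illustration: the run covers the window
    [execution_date, execution_date + interval) and fires when it closes.
    """
    return execution_date + interval


# Values taken from the daily run logged above.
run = datetime(2020, 7, 19, 11, 5, tzinfo=timezone.utc)
started = expected_start(run, timedelta(days=1))
print(started.isoformat())
```

With a daily schedule the interval is one day, so the stamp and the start time differ by exactly one day.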

On the other hand, when I change it to 4 executions per day, as follows:

default_args = {
    "task_id": PARENT_DAG_NAME,
    "owner": Variable.get("OWNER"),
    "depends_on_past": False,
    "catchup": False,
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 0,
    "concurrency": 1,
    "start_date":datetime(2019, 9, 6)
}

main_dag = DAG(
    dag_id=PARENT_DAG_NAME,
    schedule_interval="05 11/6 * * *",
    start_date=datetime(2019, 9, 6),
    default_args=default_args,
)

The runs are:

Clock Time: 2020-07-22 11:05:00

Run: 2020-07-22T05:05:00+00:00
run_id: scheduled__2020-07-22T05:05:00+00:00
Started: 2020-07-22T11:05:00+00:00

Clock Time: 2020-07-22 17:05:00

Run: 2020-07-22T11:05:00+00:00
run_id: scheduled__2020-07-22T11:05:00+00:00
Started: 2020-07-22T17:05:00+00:00

Clock Time: 2020-07-22 23:05:00

Run: 2020-07-22T17:05:00+00:00
run_id: scheduled__2020-07-22T17:05:00+00:00
Started: 2020-07-22T23:05:00+00:00

Clock Time: 2020-07-23 05:05:00

Run: 2020-07-22T23:05:00+00:00
run_id: scheduled__2020-07-22T23:05:00+00:00
Started: 2020-07-23T05:05:00+00:00

You can see that the difference between Run / run_id and Started is now 6 hours.
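
The four logged runs can be reproduced by assuming each run is stamped with the start of its 6-hour window and starts when that window ends. This is only a sketch of the date arithmetic; `runs_after` and `INTERVAL` are hypothetical names, and no cron parsing is attempted:

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(hours=6)  # assumed spacing between consecutive runs


def runs_after(first_execution_date, count, interval=INTERVAL):
    """Return (execution_date, started) pairs for consecutive scheduled runs."""
    pairs = []
    for i in range(count):
        execution_date = first_execution_date + i * interval
        # The run starts once its own interval has elapsed.
        pairs.append((execution_date, execution_date + interval))
    return pairs


first = datetime(2020, 7, 22, 5, 5, tzinfo=timezone.utc)
schedule = runs_after(first, 4)
for execution_date, started in schedule:
    print(f"Run: {execution_date.isoformat()}  Started: {started.isoformat()}")
```

Under this assumption the lag always equals the schedule interval, which is why it shrank from one day to 6 hours.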

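This causes a problem inside the DAG, because it uses execution_date as a variable for its processing. I assumed that with multiple executions per day the 1-day difference from the original setup would be kept, but that is not what happens.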

The ideal would be:

Clock Time: 2020-07-22 05:05:00

Run: 2020-07-21T05:05:00+00:00
run_id: scheduled__2020-07-21T05:05:00+00:00
Started: 2020-07-22T05:05:00+00:00

Clock Time: 2020-07-22 11:05:00

Run: 2020-07-21T11:05:00+00:00
run_id: scheduled__2020-07-21T11:05:00+00:00
Started: 2020-07-22T11:05:00+00:00

Clock Time: 2020-07-22 17:05:00

Run: 2020-07-21T17:05:00+00:00
run_id: scheduled__2020-07-21T17:05:00+00:00
Started: 2020-07-22T17:05:00+00:00

Clock Time: 2020-07-22 23:05:00

Run: 2020-07-21T23:05:00+00:00
run_id: scheduled__2020-07-21T23:05:00+00:00
Started: 2020-07-22T23:05:00+00:00
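
Each desired value above equals the actual start time minus one day, which is the actual execution_date plus the schedule interval minus one day. A minimal sketch of that derivation, assuming a fixed 6-hour interval (`day_before_start` is a hypothetical helper, not an Airflow function):

```python
from datetime import datetime, timedelta, timezone


def day_before_start(execution_date: datetime, interval: timedelta) -> datetime:
    """Shift a run's stamp so it is exactly one day before the run's start."""
    next_execution_date = execution_date + interval  # when the run fires
    return next_execution_date - timedelta(days=1)


# Actual stamp logged for the run that started at 2020-07-22 11:05.
actual = datetime(2020, 7, 22, 5, 5, tzinfo=timezone.utc)
wanted = day_before_start(actual, timedelta(hours=6))
print(wanted.isoformat())
```

Inside a templated task field, the same value could probably be written as `{{ next_execution_date - macros.timedelta(days=1) }}`. This assumes an Airflow version whose template context still provides `next_execution_date`, which should be checked against the installed release.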

Should I change something in the DAG settings, or is this simply how Airflow works?

0 answers:

No answers yet