How to set an equal priority_weight for tasks that depend on another task

Time: 2019-05-21 07:15:13

Tags: airflow airflow-scheduler

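I have 8 sets of tasks. Each set is a chain of tasks: task1 >> task2 >> task3. task3 depends on task2, which in turn depends on task1.

My problem is that no task2 ever starts until every task1 has finished. So before set1.task2 can start, set8.task1 has to run first.

My initial research led me to priority_weight, which can be included in the DAG's default_args. I learned that task1 gets a higher priority weight than its downstream tasks.

Is there a way to make all the priority weights equal? That way set1.task2 could start regardless of set2, set3, and so on, since it depends only on set1.task1.

Thanks!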

1 answer:

Answer 0: (score: 1)

Setting weight_rule to "upstream" or "absolute" should help. This is from the BaseOperator docstring:

:param weight_rule: weighting method used for the effective total
    priority weight of the task. Options are:
    ``{ downstream | upstream | absolute }`` default is ``downstream``
    When set to ``downstream`` the effective weight of the task is the
    aggregate sum of all downstream descendants. As a result, upstream
    tasks will have higher weight and will be scheduled more aggressively
    when using positive weight values. This is useful when you have
    multiple dag run instances and desire to have all upstream tasks to
    complete for all runs before each dag can continue processing
    downstream tasks. When set to ``upstream`` the effective weight is the
    aggregate sum of all upstream ancestors. This is the opposite where
    downstream tasks have higher weight and will be scheduled more
    aggressively when using positive weight values. This is useful when you
    have multiple dag run instances and prefer to have each dag complete
    before starting upstream tasks of other dags.  When set to
    ``absolute``, the effective weight is the exact ``priority_weight``
    specified without additional weighting. You may want to do this when
    you know exactly what priority weight each task should have.
    Additionally, when set to ``absolute``, there is bonus effect of
    significantly speeding up the task creation process as for very large
    DAGS. Options can be set as string or using the constants defined in
    the static class ``airflow.utils.WeightRule``

Link: https://github.com/apache/airflow/blob/master/airflow/models/baseoperator.py#L129-L150
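
To make the three rules concrete, here is a rough sketch in plain Python that models the effective weight of each task in one task1 >> task2 >> task3 chain. It does not import Airflow. The helper name `effective_weights` is made up for illustration, and it assumes that a task's own priority_weight is added to the sum of the weights of its descendants (or ancestors), with every task left at a priority_weight of 1. The `default_args` dict at the end is a hypothetical example of what could be passed to the DAG so that every operator picks up the same rule.

```python
# Illustrative model only: mimics the weighting rules described in the
# docstring above. It does not import or call Airflow.


def effective_weights(chain, rule, priority_weight=1):
    """Return {task: effective weight} for a linear chain t1 >> t2 >> ...

    Assumption: effective weight = own priority_weight plus the
    priority_weight of every relative counted by the rule.
    """
    n = len(chain)
    weights = {}
    for i, task in enumerate(chain):
        if rule == "downstream":
            relatives = n - 1 - i  # all tasks after this one
        elif rule == "upstream":
            relatives = i  # all tasks before this one
        elif rule == "absolute":
            relatives = 0  # only the task's own weight counts
        else:
            raise ValueError("unknown weight rule: %r" % rule)
        weights[task] = priority_weight * (1 + relatives)
    return weights


chain = ["task1", "task2", "task3"]
for rule in ("downstream", "upstream", "absolute"):
    print(rule, effective_weights(chain, rule))

# Hypothetical default_args for the DAG in the question. weight_rule is
# given as a string, which the docstring says is allowed.
default_args = {
    "priority_weight": 1,
    "weight_rule": "absolute",
}
```

Under the default "downstream" rule every task1 outranks every task2, which matches the behaviour described in the question. With "absolute" all tasks end up with the same weight, and with "upstream" the later tasks of a set outrank the first tasks of the other sets.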