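I have 8 sets of tasks. Each set is a chain of tasks: task1 >> task2 >> task3. task3 depends on task2, which in turn depends on task1.
My problem is that no task2 ever starts until every task1 has finished. So before set1.task2 can start, set8.task1 has to run first.
My initial research pointed me to priority_weight, which can be included in the DAG's default_args. I learned that task1 gets a higher priority weight because of its downstream tasks.
Is there a way to make all the priority weights the same? That way set1.task2 could start regardless of set2, set3, etc., since it depends only on set1.task1.
Thanks!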
Answer 0 (score: 1)
Setting weight_rule to "upstream" or "absolute" should help. This is from the BaseOperator docstring:
:param weight_rule: weighting method used for the effective total
priority weight of the task. Options are:
``{ downstream | upstream | absolute }`` default is ``downstream``
When set to ``downstream`` the effective weight of the task is the
aggregate sum of all downstream descendants. As a result, upstream
tasks will have higher weight and will be scheduled more aggressively
when using positive weight values. This is useful when you have
multiple dag run instances and desire to have all upstream tasks to
complete for all runs before each dag can continue processing
downstream tasks. When set to ``upstream`` the effective weight is the
aggregate sum of all upstream ancestors. This is the opposite where
downstream tasks have higher weight and will be scheduled more
aggressively when using positive weight values. This is useful when you
have multiple dag run instances and prefer to have each dag complete
before starting upstream tasks of other dags. When set to
``absolute``, the effective weight is the exact ``priority_weight``
specified without additional weighting. You may want to do this when
you know exactly what priority weight each task should have.
Additionally, when set to ``absolute``, there is bonus effect of
significantly speeding up the task creation process as for very large
DAGS. Options can be set as string or using the constants defined in
the static class ``airflow.utils.WeightRule``
Link: https://github.com/apache/airflow/blob/master/airflow/models/baseoperator.py#L129-L150
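To make the effect of the three rules concrete, here is a rough sketch in plain Python. It is a simplified model of the weighting described in the docstring, not Airflow's own code: the names CHAIN, PRIORITY_WEIGHT, effective_weight and weights_for are made up for illustration, and it assumes one linear chain task1 >> task2 >> task3 where every task keeps a priority_weight of 1. The default_args dict at the end shows one way the rule might be applied to every task in a DAG.

```python
# Simplified model of the effective priority weight described in the
# BaseOperator docstring. Illustration only; this is not Airflow code.
# Assumption: one linear chain, each task with priority_weight = 1.

CHAIN = ["task1", "task2", "task3"]
PRIORITY_WEIGHT = {name: 1 for name in CHAIN}


def effective_weight(task, weight_rule):
    """Return the effective weight of `task` in CHAIN under `weight_rule`."""
    position = CHAIN.index(task)
    own = PRIORITY_WEIGHT[task]
    if weight_rule == "absolute":
        # Exactly the task's own priority_weight, nothing added.
        return own
    if weight_rule == "downstream":
        # Own weight plus the weights of all downstream descendants.
        related = CHAIN[position + 1:]
    elif weight_rule == "upstream":
        # Own weight plus the weights of all upstream ancestors.
        related = CHAIN[:position]
    else:
        raise ValueError(f"unknown weight_rule: {weight_rule!r}")
    return own + sum(PRIORITY_WEIGHT[name] for name in related)


def weights_for(weight_rule):
    """Map every task in CHAIN to its effective weight under the rule."""
    return {name: effective_weight(name, weight_rule) for name in CHAIN}


# default_args that would give every task the same effective weight.
# Both keys are BaseOperator arguments named in the question and answer.
default_args = {
    "priority_weight": 1,
    "weight_rule": "absolute",
}

if __name__ == "__main__":
    for rule in ("downstream", "upstream", "absolute"):
        print(rule, weights_for(rule))
```

In this model the default "downstream" rule gives task1 a weight of 3 and task2 a weight of 2, so each of the eight task1s outranks every task2 whenever they compete for a free slot, which matches the behavior in the question. With "absolute" all tasks tie at 1, and with "upstream" the later tasks in each set rank higher. Passing the dict as DAG(..., default_args=default_args) should apply the rule to every task, though how much the ordering actually changes will also depend on how many slots the executor and pools have available.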