Running two luigi WrapperTasks one after the other

Time: 2019-10-04 11:38:54

Tags: python pipeline luigi

import luigi as li


class TaskA(li.Task):

    def output(self):
        return li.LocalTarget('TaskA.txt')

    def run(self):
        with self.output().open('w') as outfile:
            outfile.write('DONE_A')


class TaskB(li.Task):
    required_task = li.TaskParameter()

    def output(self):
        return li.LocalTarget('TaskB.txt')

    def requires(self):
        return self.required_task

    def run(self):
        with self.output().open('w') as outfile:
            outfile.write('DONE_B')


class TaskC(li.Task):

    def output(self):
        return li.LocalTarget('TaskC.txt')

    def run(self):
        with self.output().open('w') as outfile:
            outfile.write('DONE_C')


class PipelineX(li.WrapperTask):

    def requires(self):
        task_a = TaskA()
        return TaskB(required_task=task_a)


class PipelineY(li.WrapperTask):

    def requires(self):
        return TaskC()


class AllPipelines(li.?):
    pipeline_x = li.TaskParameter(default=PipelineX())
    pipeline_y = li.TaskParameter(default=PipelineY())

    # problem: PipelineY depends on PipelineX
    # how to first run pipeline_x, wait until it finished, then
    # run pipeline_y? Afterwards AllPipelines should complete.

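Hello community,

I am looking for a way to run several tasks (currently WrapperTasks) one after another.

I have tried to boil my problem down to the example code above, and I would be very glad if someone could give me some hints on how to manage this.

The goals are:

  1. Run PipelineX
  2. Once 1. is completed(), run PipelineY
  3. When everything has finished, complete AllPipelines

Thanks a lot for your help!

Best regards,

Chris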

1 answer:

Answer 0: (score: 1)

First, since you say that PipelineY depends on PipelineX, the most natural thing is to include PipelineX in PipelineY's requires:

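class PipelineY(luigi.WrapperTask):
    def requires(self):
        # a WrapperTask is complete once everything it requires is complete;
        # luigi does not order the entries of this list relative to each other
        return [PipelineX(), TaskC()]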

However, I would bet that it is actually TaskC that depends on PipelineX, so you could put PipelineX into TaskC's requires instead.

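If you really do need the pipelines as they are, though, and the approaches above do not work for you, you can use luigi's dynamic dependencies (https://luigi.readthedocs.io/en/stable/tasks.html#dynamic-dependencies):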

class AllPipelines(luigi.Task):
    def output(self):
        return luigi.LocalTarget("success.txt")

    def run(self):
        # each yield suspends run() until the yielded task is complete
        yield PipelineX()
        yield PipelineY()
        with self.output().open('w') as out_file:
            out_file.write("1")