luigi - How do I create a dependency between tasks rather than between files? (or how to avoid involving the output method)

Time: 2019-05-15 08:16:43

Tags: python python-2.7 luigi

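Given two luigi tasks, how can I make one a requirement of the other so that, once the requirement has finished, the second task can start without any output being involved?

Currently I get RuntimeError: Unfulfilled dependency at run time: MyTask___home_..., even though the required task finished fine, because my requires / output methods are not configured correctly...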

import luigi
from luigi.contrib.external_program import ExternalProgramTask


class ShellTask(ExternalProgramTask):
    """
    ExternalProgramTask's subclass dedicated for one task with the capture output ability.

    Args:
        shell_cmd (str): The shell command to be run in a subprocess.
        capture_output (bool, optional): If True the output is not displayed to console,
                                         and printed after the task is done via 
                                         logger.info (both stdout + stderr).
                                         Defaults to True.
    """
    shell_cmd = luigi.Parameter()
    requirement = luigi.Parameter(default='')
    succeeded = False

    def on_success(self):
        self.succeeded = True

    def requires(self):
        return eval(self.requirement) if self.requirement else None

    def program_args(self):
        """
        Must be implemented in an ExternalProgramTask subclass.
        Returns:
            A script that would be run in a subprocess.Popen.
        Args:
            shell_cmd (luigi.Parameter (str)): the shell command to be passed as args
                                               to the run method (run should not be overridden!).
        """
        return self.shell_cmd.split()


class MyTask(ShellTask):
    """
    ShellTask subclass used by the __main__ block below; adds no behavior of its own.
    """
    pass

if __name__ == '__main__':
    task_0 = MyTask(
            shell_cmd='...',
            requirement="MyTask(shell_cmd='...')",
            )
    luigi.build([task_0], workers=2, local_scheduler=False)

I was hoping that on_success could signal something to the calling task, but I could not figure out how.

I am currently working around this as follows:

1) implementing the output method based on the input of the task (much like the eval(requirement) I did)
2) implementing the run method (calling the super run and then writing "ok" to the output)
3) deleting the output files from main
4) calling it something like this:

if __name__ == '__main__':
    clean_output_files(['_.txt'])
    task = MyTask(
            shell_cmd='...',
            requirement="MyTask(shell_cmd='...', output_file='_.txt')",
            )

1 Answer:

Answer 0: (score: 0)

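In your first Luigi task, you can trigger the second task by declaring it as a requirement. Overriding complete() on the required task lets it report that it has finished without defining an output.

For example: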

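import luigi


class TaskB(luigi.Task):
    def __init__(self, *args, **kwargs):
        # Explicit super() arguments keep this working on Python 2.7 as well.
        super(TaskB, self).__init__(*args, **kwargs)
        # In-memory flag; only visible inside the process that ran the task.
        self.complete_flag = False

    def run(self):
        self.complete_flag = True
        print('do something')

    def complete(self):
        # Report completion from the flag instead of checking output().
        return self.complete_flag


class TaskA(luigi.Task):
    def requires(self):
        return TaskB()

    def run(self):
        print('Carry on with other logic')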