Unable to chain tasks with Luigi

Time: 2019-07-31 20:21:35

Tags: python pipeline luigi

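I am trying to chain multiple tasks using the Luigi package. The tasks split one CSV into separate CSVs, one for each distinct URL substring. However, I am having trouble writing and running a wrapper that chains the tasks I have defined.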

#categorise unique user actions into separate csv files
import luigi
import pandas as pd
import os
#if the user read a blog post
class blog_readers(luigi.Task):
    def run(self):
        read_blog = full_file[full_file['properties_url'].str.contains('blog',
                regex=False)]
        read_blog.to_csv('/Users/emmanuels/Documents/GitHub/Springboard-DSC/Springboard-DSC/Capstone 1 - Attribution Model/Data/blog_readers.csv')
#if user logged in
class logged_in(luigi.Task):
    def run(self):
        logged_in = full_file[full_file['properties_url'].str.contains('login',regex=False)]
        logged_in.to_csv('/Users/emmanuels/Documents/GitHub/Springboard-DSC/Springboard-DSC/Capstone 1 - Attribution Model/Data/logged_in.csv')
#chaining tasks with wrapper
class workflow(luigi.WrapperTask):
    take_file = luigi.Parameter()   
    def requires(self):
        return [
            blog_readers(full_file = self.take_file),
            logged_in(full_file = self.take_file),
            ]
if __name__ == '__main__':
    luigi.run(main_task_cls =workflow)

I assumed that the wrapper here chains the tasks I created earlier. However, when I run this command in the terminal

python cleanup.py --local-scheduler workflow --task_file '/Users/emmanuels/Desktop/attributiondata.csv'

this is the full traceback I get

Traceback (most recent call last):
  File "cleanup.py", line 57, in <module>
    luigi.run()
  File "/Users/emmanuels/anaconda3/lib/python3.7/site-packages/luigi/interface.py", line 216, in run
    return _run(*args, **kwargs)['success']
  File "/Users/emmanuels/anaconda3/lib/python3.7/site-packages/luigi/interface.py", line 243, in _run
    with CmdlineParser.global_instance(cmdline_args) as cp:
  File "/Users/emmanuels/anaconda3/lib/python3.7/contextlib.py", line 112,in __enter__
    return next(self.gen)
  File "/Users/emmanuels/anaconda3/lib/python3.7/site-packages/luigi/cmdline_parser.py", line 52, in global_instance
    new_value = CmdlineParser(cmdline_args)
  File "/Users/emmanuels/anaconda3/lib/python3.7/site-packages/luigi/cmdline_parser.py", line 76, in __init__
    Register.get_task_cls(root_task)
  File "/Users/emmanuels/anaconda3/lib/python3.7/site-packages/luigi/task_register.py", line 179, in get_task_cls
    raise TaskClassNotFoundException(cls._missing_task_msg(name))
luigi.task_register.TaskClassNotFoundException: No task workflow. Did you mean: worker

0 answers:

No answers