How to dynamically create Luigi tasks

Date: 2016-08-30 00:35:35

Tags: python pickle luigi

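I am building a wrapper for Luigi Tasks and have hit a snag with the Register class, which is actually an ABC metaclass and breaks down when I create a type dynamically.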

The following code is more or less what I am using to build the dynamic class.

import luigi
from luigi.contrib.spark import PySparkTask


class TaskWrapper(object):
    '''Luigi Spark Factory from the provided JobClass

    Args:
        JobClass(ScrubbedClass): The job to wrap
        options: Options as passed into the JobClass
    '''

    def __new__(self, JobClass, **options):
        # Validate we have a good job
        valid_classes = (
            ScrubbedClass01,
            # ScrubbedClass02,
            # ScrubbedClass03,
        )
        if any(vc == JobClass for vc in valid_classes) or not issubclass(JobClass, valid_classes):
            raise TypeError('Job is not the correct class: {}'.format(JobClass))

        # Build a luigi task class dynamically
        luigi_identifier = 'Task'
        job_name = JobClass.__name__
        job_name = job_name.replace('Pail', '')
        if not job_name.endswith(luigi_identifier):
            job_name += luigi_identifier

        LuigiTask = type(job_name, (PySparkTask, ), {})

        for k, v in options.items():
            setattr(LuigiTask, k, luigi.Parameter())

        def main(self, sc, *args):
            job = JobClass(**options)
            return job._run()

        LuigiTask.main = main

        return LuigiTask

However, when I run my calling function, I get: PicklingError: Can't pickle <class 'abc.ScrubbedNameTask'>: attribute lookup abc.ScrubbedNameTask failed

The calling function:

def create_task(JobClass, **options):
    LuigiTask = TaskWrapper(JobClass, **options)
    # Add parameters
    parameters = {
        d: options.get(d)
        for d in dir(LuigiTask)
        if not d.startswith('_')
        if isinstance(getattr(LuigiTask, d), luigi.Parameter)
        if d in options
    }

    task = LuigiTask(**parameters)
    return task

1 Answer:

Answer 0: (score: 0)

When a class is created dynamically with an ABC metaclass, its module becomes abc. When a worker then tries to locate the task, it looks in the abstract base class module, where of course the class does not exist.
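The effect can be reproduced without Luigi. The following is a minimal sketch, assuming Python 3 and using a hypothetical `BaseTask` with a plain `abc.ABCMeta` metaclass as a stand-in for a Luigi task base class; `BrokenTask` and `can_pickle` are illustrative names, not part of Luigi.

```python
import abc
import pickle


class BaseTask(metaclass=abc.ABCMeta):
    """Hypothetical stand-in for a Luigi task base class."""


# type() defers to ABCMeta, whose __new__ runs inside the abc module,
# so the new class records 'abc' as its __module__.
BrokenTask = type('BrokenTask', (BaseTask,), {})
print(BrokenTask.__module__)


def can_pickle(cls):
    """Return True if pickle can serialize a reference to cls."""
    try:
        pickle.dumps(cls)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


# pickle stores classes by module and name, then looks the name up
# in that module; 'abc' has no attribute called BrokenTask.
print(can_pickle(BrokenTask))
```

Pickle serializes a class as a reference (module plus name) rather than by value, which is why a wrong `__module__` is enough to make the lookup fail.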

To fix this, make sure luigi knows where to find the code that built the class by manually resetting the __module__ variable.

Change the line to:

LuigiTask = type(job_name, (PySparkTask, ), {'__module__': __name__})
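As a rough, Luigi-free illustration of the same idea, the sketch below pins `__module__` through the namespace dict passed to `type()`. It assumes Python 3; `BaseTask` is a hypothetical stand-in for `PySparkTask`, and `make_task_class` is an illustrative helper, not a Luigi API.

```python
import abc


class BaseTask(metaclass=abc.ABCMeta):
    """Hypothetical stand-in for PySparkTask."""


def make_task_class(name, module_name):
    """Build a class dynamically and set its __module__ explicitly."""
    # An explicit '__module__' entry takes precedence over the value
    # that would otherwise be inferred from inside the abc module.
    return type(name, (BaseTask,), {'__module__': module_name})


# Bind the class under its own name in the module that built it.
FixedTask = make_task_class('FixedTask', __name__)
print(FixedTask.__module__ == __name__)
```

Note that plain `pickle` additionally needs the class to be reachable as an attribute of that module under the same name, so binding the generated class at module level, as done here, may also be necessary.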

As far as I know, this is only a problem on Windows.