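Unable to run mapreduce with luigi

Time: 2013-10-09 19:38:42

Tags: spotify hadoop-streaming

I am new to map-reduce jobs. This may be a basic question, but the existing documentation has not helped me. How do I run a mapreduce job with luigi? For example, what arguments do I need to pass to wordcount_hadoop.py to start the job?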

python examples/wordcount_hadoop.py --date-interval 2012-06-01

Output:

usage: wordcount_hadoop.py [-h] [--scheduler-port SCHEDULER_PORT] [--lock]
                           [--workers WORKERS] [--lock-pid-dir LOCK_PID_DIR]
                           [--scheduler-host SCHEDULER_HOST]
                           [--local-scheduler] [--pool POOL]
                           {BaseHadoopJobTask,EnvironmentParamsContainer,JobTask,Task,WordCount,WrapperTask}
                           ...
wordcount_hadoop.py: error: argument {BaseHadoopJobTask,EnvironmentParamsContainer,JobTask,Task,WordCount,WrapperTask}: invalid choice: '2012-07' (choose from 'JobTask', 'Task', 'WrapperTask', 'WordCount', 'EnvironmentParamsContainer', 'BaseHadoopJobTask')

1 answer:

Answer 0 (score: 3)

You need to pass the task name in the command, before the task's own options.

For example:

python examples/wordcount_hadoop.py WordCount --date-interval 2012-06-01
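
The usage message in the question looks like the output of an argparse parser that has one sub-command per task. The sketch below is a hypothetical stand-in written under that assumption, not luigi's actual code; build_parser and parse are names made up for illustration. It shows why the first command is rejected while the second is accepted.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the parser behind wordcount_hadoop.py;
    # this is an assumption for illustration, not luigi's implementation.
    parser = argparse.ArgumentParser(prog="wordcount_hadoop.py")
    parser.add_argument("--local-scheduler", action="store_true")
    # Each task is a sub-command, so the task name is a positional
    # argument that has to come before the task's own options.
    tasks = parser.add_subparsers(dest="task")
    word_count = tasks.add_parser("WordCount")
    word_count.add_argument("--date-interval")
    return parser


def parse(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints the usage and error text, then exits.
        return None


if __name__ == "__main__":
    # Task name missing: argparse reports an error and parse returns None.
    print(parse(["--date-interval", "2012-06-01"]))
    # Task name first: --date-interval is handled by the WordCount sub-command.
    print(parse(["WordCount", "--date-interval", "2012-06-01"]))
```

With a parser shaped like this, --date-interval only exists on the WordCount sub-command, so the top-level parser tries to read the date as the task name and fails.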