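I'm new to map-reduce jobs. These may be basic questions, but the existing documentation hasn't helped me. How do I run a MapReduce job with luigi? For example, what arguments do I need to pass to wordcount_hadoop.py to get it to start?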
python examples/wordcount_hadoop.py --date-interval 2012-06-01
Output:
usage: wordcount_hadoop.py [-h] [--scheduler-port SCHEDULER_PORT] [--lock]
[--workers WORKERS] [--lock-pid-dir LOCK_PID_DIR]
[--scheduler-host SCHEDULER_HOST]
[--local-scheduler] [--pool POOL]
{BaseHadoopJobTask,EnvironmentParamsContainer,JobTask,Task,WordCount,WrapperTask} ...
wordcount_hadoop.py: error: argument {BaseHadoopJobTask,EnvironmentParamsContainer,JobTask,Task,WordCount,WrapperTask}: invalid choice: '2012-07' (choose from 'JobTask', 'Task', 'WrapperTask', 'WordCount', 'EnvironmentParamsContainer', 'BaseHadoopJobTask')
Answer 0 (score: 3)
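You need to pass the task name on the command line, before the task's own options. The usage message lists the valid task names in braces; here the one you want is WordCount.
For example: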
python examples/wordcount_hadoop.py WordCount --date-interval 2012-06-01
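The usage message in the question looks like what Python's argparse prints when a required positional subcommand is missing or wrong. The sketch below is only a simplified model of that behaviour, not luigi's actual code: the parser layout and the helper names `build_parser` and `try_parse` are assumptions made for illustration. It shows why the first positional argument has to be a task name.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the script's command line, not luigi's real parser.
    parser = argparse.ArgumentParser(prog="wordcount_hadoop.py")
    parser.add_argument("--local-scheduler", action="store_true")

    # The task name is a required positional subcommand.
    subparsers = parser.add_subparsers(dest="task", required=True)

    # Task-specific options belong to the subcommand, so they must come after it.
    word_count = subparsers.add_parser("WordCount")
    word_count.add_argument("--date-interval")
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints the usage message and exits on invalid input.
        return None


# Without a task name, the date is taken as the subcommand: "invalid choice".
missing_task = try_parse(["--date-interval", "2012-06-01"])

# With the task name first, the option is parsed by the WordCount subcommand.
with_task = try_parse(["WordCount", "--date-interval", "2012-06-01"])
print(missing_task, with_task)
```

In this model the first call fails because the only bare value on the line, the date, gets matched against the list of task names, which mirrors the "invalid choice" error shown in the question.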