I have a directory named simple.input on my VM. I am trying to run my MapReduce job, which reads from simple.input, using the following command:
hadoop jar hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar -mapper ./fof.mapper.py -reducer fof.reducer.py -input simple.input/ -output simple.output
This is the output:
16/11/11 00:03:49 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/11/11 00:03:49 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/11/11 00:03:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/11/11 00:03:49 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-parallel/mapred/staging/parallel2080528058/.staging/job_local2080528058_0001
16/11/11 00:03:49 ERROR streaming.StreamJob: Error Launching job : Input path does not exist: hdfs://localhost:9000/user/parallel/simple.input
Streaming Command Failed!
I have already copied simple.input to HDFS:
parallel@parallel-pr3:~$ hadoop fs -copyFromLocal simple.input /
copyFromLocal: `/simple.input/100': File exists
copyFromLocal: `/simple.input/200': File exists
copyFromLocal: `/simple.input/300': File exists
copyFromLocal: `/simple.input/400': File exists
parallel@parallel-pr3:~$
Answer 0 (score: 1)
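As I understand it, you need to change the -input argument.
From: -input simple.input/
To: -input /simple.input/
A relative path is resolved against your HDFS home directory, which is why the error mentions hdfs://localhost:9000/user/parallel/simple.input, whereas your copy went to /simple.input at the HDFS root.
Hope this helps!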
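
To illustrate why the leading slash matters, here is a minimal Python sketch of how a relative HDFS path ends up under the user's home directory. It is only a model of the behaviour seen in the error message, not Hadoop's actual code: the function name resolve_hdfs_path and its parameters are hypothetical, and the default filesystem hdfs://localhost:9000 and the user name parallel are taken from the log above.

```python
import posixpath


def resolve_hdfs_path(path, user, default_fs="hdfs://localhost:9000"):
    """Hypothetical helper: model how an HDFS path is qualified.

    Relative paths are placed under the assumed home directory
    /user/<user>; absolute paths are used unchanged.
    """
    if not path.startswith("/"):
        # Relative path: resolve against the HDFS home directory.
        path = posixpath.join("/user", user, path)
    # normpath drops the trailing slash and collapses duplicate separators.
    return default_fs + posixpath.normpath(path)


# The path from the original command lands in the home directory.
print(resolve_hdfs_path("simple.input/", "parallel"))
# The corrected path points at the HDFS root, where the data was copied.
print(resolve_hdfs_path("/simple.input/", "parallel"))
```

The first call gives the same location that the streaming job complained about, and the second gives the location that copyFromLocal wrote to. Before rerunning the job, hadoop fs -ls /simple.input should confirm that the files 100, 200, 300 and 400 are there.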