我正在使用Hadoop学习Map-reduce,我正在运行此命令:
hadoop jar /usr/lib/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py -input sales_data -output salesout
我包含了我得到的完整错误输出:
16/04/15 00:39:26 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [mapper.py, reducer.py] [] /tmp/streamjob4183555536412178637.jar tmpDir=null
16/04/15 00:39:28 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/04/15 00:39:28 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/04/15 00:39:28 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/15 00:39:28 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-enlighter/mapred/staging/enlighter1664997312/.staging/job_local1664997312_0001
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in path at index 112: file:/run/media/enlighter/dd3546e2-4871-4fc6-a57e-7336392cb705/home/enlighter/workspace/dbms-lab/assign_4(Hadoop MapReduce project)/code/tester/mapper.py
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:109)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:95)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:190)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1014)
at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.URISyntaxException: Illegal character in path at index 112: file:/run/media/enlighter/dd3546e2-4871-4fc6-a57e-7336392cb705/home/enlighter/workspace/dbms-lab/assign_4(Hadoop MapReduce project)/code/tester/mapper.py
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.checkChars(URI.java:3021)
at java.net.URI$Parser.parseHierarchical(URI.java:3105)
at java.net.URI$Parser.parse(URI.java:3053)
at java.net.URI.<init>(URI.java:588)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:107)
... 26 more
执行似乎是我的mapper和reducer scipts所在的系统路径有问题。它无法正确解析路径中的特殊字符。
如何成功完成这项工作? 我是否需要更改任何文件夹名称?或者有更好的解决方案吗?
答案 0 :(得分:1)
我也遇到了同样的异常,我能够通过删除foldername中保留了mapper和reducer程序的空间来解决问题。