My scripts work perfectly when I run them locally:
cat England.txt | ./mapperEngl.py | sort | ./reducerEngl.py
But when I run:
/shared/hadoop/cur/bin/hadoop jar /shared/hadoop/cur/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file /home/hadoop/mapperEngl.py -mapper /home/hadoop/mapperEngl.py -file /home/hadoop/reducerEngl.py -reducer /home/hadoop/reducerEngl.py -input /datadir/England.txt -output /outputdir/climateresults3.txt
I get the following error:
16/05/03 09:27:15 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
16/05/03 09:27:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/home/hadoop/mapperEngl.py, /home/hadoop/reducerEngl.py, /tmp/hadoop-unjar6814867016081507297/] [] /tmp/streamjob1585723008278678599.jar tmpDir=null
16/05/03 09:27:16 INFO client.RMProxy: Connecting to ResourceManager at mgmt-florida-poly-eth0/10.200.209.10:8032
16/05/03 09:27:16 INFO client.RMProxy: Connecting to ResourceManager at mgmt-florida-poly-eth0/10.200.209.10:8032
16/05/03 09:27:17 INFO mapred.FileInputFormat: Total input paths to process : 1
16/05/03 09:27:17 INFO mapreduce.JobSubmitter: number of splits:2
16/05/03 09:27:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459438007195_0006
16/05/03 09:27:17 INFO impl.YarnClientImpl: Submitted application application_1459438007195_0006
16/05/03 09:27:17 INFO mapreduce.Job: The url to track the job: http://mgmt-florida-poly-eth0:8088/proxy/application_1459438007195_0006/
16/05/03 09:27:17 INFO mapreduce.Job: Running job: job_1459438007195_0006
16/05/03 09:27:25 INFO mapreduce.Job: Job job_1459438007195_0006 running in uber mode : false
16/05/03 09:27:25 INFO mapreduce.Job: map 0% reduce 0%
16/05/03 09:27:31 INFO mapreduce.Job: map 50% reduce 0%
16/05/03 09:27:32 INFO mapreduce.Job: map 100% reduce 0%
16/05/03 09:27:38 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:45 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:51 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:58 INFO mapreduce.Job: map 100% reduce 100%
16/05/03 09:27:58 INFO mapreduce.Job: Job job_1459438007195_0006 failed with state FAILED due to: Task failed task_1459438007195_0006_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
16/05/03 09:27:58 INFO mapreduce.Job: Counters: 37
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=228560
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=29265
HDFS: Number of bytes written=0
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed reduce tasks=4
Launched map tasks=2
Launched reduce tasks=4
Rack-local map tasks=2
Total time spent by all maps in occupied slots (ms)=134880
Total time spent by all reduces in occupied slots (ms)=242432
Total time spent by all map tasks (ms)=8430
Total time spent by all reduce tasks (ms)=15152
Total vcore-seconds taken by all map tasks=8430
Total vcore-seconds taken by all reduce tasks=15152
Total megabyte-seconds taken by all map tasks=17264640
Total megabyte-seconds taken by all reduce tasks=31031296
Map-Reduce Framework
Map input records=107
Map output records=223
Map output bytes=9014
Map output materialized bytes=9472
Input split bytes=202
Combine input records=0
Spilled Records=223
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
CPU time spent (ms)=1540
Physical memory (bytes) snapshot=1305165824
Virtual memory (bytes) snapshot=5482422272
Total committed heap usage (bytes)=2022440960
File Input Format Counters
Bytes Read=29063
16/05/03 09:27:58 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
[hadoop@mgmt-florida-poly ~]$
I have tried the solutions from other questions, but none of them seem to work.
So yes, I am completely stuck here.
Answer 0 (score: 0)
Crazy, but I fixed mine by using #!/usr/bin/python instead of #!/usr/bin/python3.
I think there is a problem with our Hadoop cluster configuration.
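If the shebang is the culprit, one way to confirm it is to check whether the interpreter named on the script's first line actually exists on the machine that runs the task. The following is a minimal sketch of such a check, not part of the original fix: the helper names shebang_command and interpreter_available are hypothetical, and it assumes you can run it on the worker nodes themselves (for example over ssh), since the interpreter may exist on the machine you submit from but be missing on the nodes that run the reducer.

```python
import os
import shlex
import shutil


def shebang_command(script_path):
    """Return the shebang line of a script split into words, or [] if it has none."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return []
    return shlex.split(first_line[2:])


def interpreter_available(script_path):
    """Check whether the interpreter named in the shebang exists on this machine."""
    cmd = shebang_command(script_path)
    if not cmd:
        return False
    # "#!/usr/bin/env python3" style: look the program up on PATH instead.
    if os.path.basename(cmd[0]) == "env" and len(cmd) > 1:
        return shutil.which(cmd[1]) is not None
    # Absolute-path style: the file must exist and be executable.
    return os.path.isfile(cmd[0]) and os.access(cmd[0], os.X_OK)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_shebang.py mapperEngl.py reducerEngl.py
    for path in sys.argv[1:]:
        status = "ok" if interpreter_available(path) else "MISSING interpreter"
        print(path, shebang_command(path), status)
```

A result of "MISSING interpreter" for reducerEngl.py on a worker node would be consistent with the "subprocess failed with code 1" error above, but it is not the only possible cause. The reducer's own stderr is the more direct evidence; if log aggregation is enabled on the cluster, it can usually be retrieved with yarn logs -applicationId application_1459438007195_0006.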