Hadoop Streaming with Python is not working

Time: 2018-01-12 15:11:39

Tags: python hadoop hadoop-streaming

I am trying to run Hadoop Streaming with Python by following Edureka's tutorial. Everything went fine, but when I run the job with

`hadoop jar /home/carlos/hadoop-2.7.3/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar -file /home/carlos/mapper.py -mapper mapper.py -file /home/carlos/reducer.py -reducer reducer.py -input /Streaming/word.txt -output /Streaming/WordCount.txt`

it throws this error:

18/01/12 08:28:10 INFO mapreduce.Job: Running job: job_1515762484609_0001
18/01/12 08:29:36 INFO mapreduce.Job: Job job_1515762484609_0001 running in uber mode : false
18/01/12 08:29:36 INFO mapreduce.Job:  map 0% reduce 0%
18/01/12 08:31:19 INFO mapreduce.Job:  map 33% reduce 0%
18/01/12 08:31:57 INFO mapreduce.Job:  map 50% reduce 0%
18/01/12 08:32:02 INFO mapreduce.Job:  map 0% reduce 0%
18/01/12 08:32:30 INFO mapreduce.Job: Task Id : attempt_1515762484609_0001_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
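
The trace above only reports that the mapper subprocess exited with status 2; it does not show the script's own error message. Since mapper.py and reducer.py are not included in the question, the following is only a minimal sketch of a streaming-style word count that can be exercised locally without Hadoop. The function names `map_lines`, `reduce_records`, and `run_local` are hypothetical, and the logic is an assumption about what the tutorial scripts do, not a copy of them.

```python
#!/usr/bin/env python
"""Local dry run of a streaming-style word count (illustrative sketch only)."""
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_records(records):
    """Reducer step: sum the counts per word. Records must already be sorted by key."""
    parsed = (rec.split("\t", 1) for rec in records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


def run_local(lines):
    """Imitate map -> sort -> reduce inside a single process."""
    return list(reduce_records(sorted(map_lines(lines))))


if __name__ == "__main__":
    # Hypothetical sample input standing in for /Streaming/word.txt.
    sample = ["hello world", "hello hadoop"]
    for record in run_local(sample):
        print(record)
```

In real streaming scripts the mapper and reducer would each read from `sys.stdin` and print to standard output. If a local run of the actual scripts succeeds, the failure is more likely related to how the script is launched on the cluster nodes than to the word-count logic itself.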

0 answers:

No answers