Is there any way to find out why a reducer is killed by the AM?

Asked: 2014-06-29 15:06:32

Tags: hadoop mapreduce yarn

I am trying to run some graph-processing jobs on Hadoop with a 1 GB input, but my reduce tasks are being killed by the Application Master. Here is the output:

14/06/29 16:15:02 INFO mapreduce.Job:  map 100% reduce 53%
14/06/29 16:15:03 INFO mapreduce.Job:  map 100% reduce 57%
14/06/29 16:15:04 INFO mapreduce.Job:  map 100% reduce 60%
14/06/29 16:15:05 INFO mapreduce.Job:  map 100% reduce 63%
14/06/29 16:15:05 INFO mapreduce.Job: Task Id : attempt_1404050864296_0002_r_000003_0, Status : FAILED
Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal

The job eventually fails with the following output:

14/06/29 16:11:58 INFO mapreduce.Job: Job job_1404050864296_0001 failed with state FAILED due to: Task failed task_1404050864296_0001_r_000001
Job failed as tasks failed. failedMaps:0 failedReduces:1

14/06/29 16:11:58 INFO mapreduce.Job: Counters: 38
        File System Counters
                FILE: Number of bytes read=1706752372
                FILE: Number of bytes written=3414132444
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1319319669
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=30
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Failed reduce tasks=7
                Killed reduce tasks=1
                Launched map tasks=10
                Launched reduce tasks=8
                Data-local map tasks=10
                Total time spent by all maps in occupied slots (ms)=12527776
                Total time spent by all reduces in occupied slots (ms)=1256256
                Total time spent by all map tasks (ms)=782986
                Total time spent by all reduce tasks (ms)=78516
                Total vcore-seconds taken by all map tasks=782986
                Total vcore-seconds taken by all reduce tasks=78516
                Total megabyte-seconds taken by all map tasks=6263888000
                Total megabyte-seconds taken by all reduce tasks=628128000
        Map-Reduce Framework
                Map input records=85331845
                Map output records=170663690
                Map output bytes=1365309520
                Map output materialized bytes=1706637020
                Input split bytes=980
                Combine input records=0
                Spilled Records=341327380
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=2573
                CPU time spent (ms)=820310
                Physical memory (bytes) snapshot=18048614400
                Virtual memory (bytes) snapshot=72212246528
                Total committed heap usage (bytes)=28289007616
        File Input Format Counters
                Bytes Read=1319318689
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
        at pegasus.DegDist.run(DegDist.java:201)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at pegasus.DegDist.main(DegDist.java:158)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I have checked the logs, but they contain nothing about why the reduce tasks were killed. Is there any way to find out why these reduce tasks are being killed? I am interested in the specific reason for the kill.
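
Exit code 137 is 128 + 9, i.e. the reduce JVM was killed from outside with SIGKILL; on YARN that usually means the NodeManager killed the container (most often for exceeding its memory allocation) or the host's OOM killer stepped in. The reason is normally recorded in the NodeManager log on the node that ran the attempt, in the aggregated container logs (yarn logs -applicationId <application id>), and as a diagnostic string attached to the failed attempt. The snippet below is only a sketch of pulling those diagnostics through the old mapred API this job already uses: the class name is hypothetical, the IDs are derived from the failed attempt shown above, and it assumes the job is still known to the cluster or the job history server.

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.TaskAttemptID;

public class ReducerKillDiagnostics {  // hypothetical helper, not part of the job
    public static void main(String[] args) throws Exception {
        // Uses the cluster configuration found on the classpath.
        JobClient client = new JobClient(new JobConf());
        // Job id derived from the failed attempt id in the output above.
        RunningJob job = client.getJob(JobID.forName("job_1404050864296_0002"));
        if (job == null) {
            System.err.println("Job no longer known to the cluster or history server");
            return;
        }
        TaskAttemptID attempt =
                TaskAttemptID.forName("attempt_1404050864296_0002_r_000003_0");
        // Diagnostics recorded for the attempt, e.g. the kill reason
        // reported back by the NodeManager.
        for (String diag : job.getTaskDiagnostics(attempt)) {
            System.out.println(diag);
        }
    }
}

If the diagnostics only repeat the terse "Container killed on request. Exit code is 137" message, the next places to look are the NodeManager log on that node (messages about killing the container or exceeding memory limits) and the node's kernel log (dmesg), which records OOM-killer activity.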

0 Answers:

No answers yet.