Why is the combiner input record count not just the mapper output record count?

Time: 2016-03-29 14:37:00

Tags: hadoop mapreduce yarn combiners mappers

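A combiner works on the mapper's output records. If the mapper output records are what gets fed to the combiner, why is my combine input record count not equal to my map output record count?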

I am getting roughly 80 extra records (83, going by the counters below). I don't know where they come from or what they contain.

YARN counter dump for the MapReduce job:

 Map-Reduce Framework
            Map input records=80000000
            Map output records=80000000
            Map output bytes=2560000000
            Map output materialized bytes=80
            Input split bytes=220
            Combine input records=80000083
            Combine output records=85
            Reduce input groups=1
            Reduce shuffle bytes=80
            Reduce input records=2
            Reduce output records=3
            Spilled Records=87
            Shuffled Maps =2
            Failed Shuffles=0
            Merged Map outputs=2
            GC time elapsed (ms)=4124
            CPU time spent (ms)=90530
            Physical memory (bytes) snapshot=573521920
            Virtual memory (bytes) snapshot=2509766656
            Total committed heap usage (bytes)=411041792
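
A quick arithmetic check on the counters above, offered as a sketch rather than a confirmed diagnosis. It assumes the combiner ran in two rounds: once on each spill, and once more when the spill files were merged (Hadoop allows a combiner to run more than once). Under that assumption the output of the first round is counted again as input to the second, which would account for the surplus. The variable names are illustrative only.

```python
# Counter values copied from the job dump above.
map_output_records = 80_000_000
combine_input_records = 80_000_083
combine_output_records = 85
reduce_input_records = 2

# Assumption: two combiner rounds (per spill, then again at merge time).
# Round 1 reads every map output record and emits X records.
# Round 2 reads those X records and emits what the reducer finally receives.
# Then: combine input  = map output + X
#       combine output = X + reduce input

# X as implied by the input-side counters.
extra_combine_inputs = combine_input_records - map_output_records

# X as implied by the output-side counters.
spill_round_outputs = combine_output_records - reduce_input_records

print(extra_combine_inputs)
print(spill_round_outputs)
```

Both expressions come out to the same number, so the counters are at least consistent with the extra combine inputs being records the combiner itself produced in an earlier round, not additional mapper output.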

0 answers:

No answers yet.