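A combiner operates on the mapper's output records. If the map output records are what gets fed into the combiner, why is my combine input record count not exactly equal to the map output record count?
I am getting roughly 80 extra records (83, going by the counters below). I don't know where they come from or what they contain.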
Counter dump from YARN for the MapReduce job:
Map-Reduce Framework
Map input records=80000000
Map output records=80000000
Map output bytes=2560000000
Map output materialized bytes=80
Input split bytes=220
Combine input records=80000083
Combine output records=85
Reduce input groups=1
Reduce shuffle bytes=80
Reduce input records=2
Reduce output records=3
Spilled Records=87
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=4124
CPU time spent (ms)=90530
Physical memory (bytes) snapshot=573521920
Virtual memory (bytes) snapshot=2509766656
Total committed heap usage (bytes)=411041792
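
One way to sanity-check these numbers, as a rough sketch rather than a definitive diagnosis: Hadoop is allowed to run the combiner more than once on the map side, first on each spill and possibly again when spill files are merged, so combiner outputs can be counted a second time as combiner inputs. If that is what happened here, the surplus of combine inputs over map outputs should equal the number of combiner outputs that never reached the reducer directly. The class and method names below are hypothetical helpers for the arithmetic only; they are not part of the Hadoop API.

```java
// Hypothetical helper (not a Hadoop class): pure arithmetic on the counters above.
public class CombinerCounterCheck {

    // Records the combiner consumed beyond what the mappers emitted.
    static long extraCombineInputs(long mapOutputRecords, long combineInputRecords) {
        return combineInputRecords - mapOutputRecords;
    }

    // Combiner outputs that did not arrive at the reducer as-is. Under the
    // assumption stated above, these were fed back into a later combiner pass.
    static long recombinedOutputs(long combineOutputRecords, long reduceInputRecords) {
        return combineOutputRecords - reduceInputRecords;
    }

    public static void main(String[] args) {
        // Values copied from the counter dump in the question.
        long extra = extraCombineInputs(80_000_000L, 80_000_083L);
        long recombined = recombinedOutputs(85L, 2L);

        System.out.println("extra combine inputs = " + extra);
        System.out.println("re-combined outputs  = " + recombined);
        System.out.println("consistent           = " + (extra == recombined));
    }
}
```

Both differences come out to 83 for this job, which is consistent with the extra records being intermediate combiner outputs that were combined again during the merge, not additional mapper output.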