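I'm running the max temperature example from Chapter 2 of Hadoop: The Definitive Guide, and I noticed that the number of input splits for the Java example differs from the number for Hadoop Streaming with Python. Can someone help me understand the reason for this difference?
Output from the Java example: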
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7007
Total time spent by all reduces in occupied slots (ms)=5760
Total time spent by all map tasks (ms)=7007
Total time spent by all reduce tasks (ms)=5760
Total vcore-seconds taken by all map tasks=7007
Total vcore-seconds taken by all reduce tasks=5760
Total megabyte-seconds taken by all map tasks=7175168
Total megabyte-seconds taken by all reduce tasks=5898240
Output from the Hadoop Streaming example using Python:
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Rack-local map tasks=2
Total time spent by all maps in occupied slots (ms)=16730
Total time spent by all reduces in occupied slots (ms)=4673
Total time spent by all map tasks (ms)=16730
Total time spent by all reduce tasks (ms)=4673
Total vcore-seconds taken by all map tasks=16730
Total vcore-seconds taken by all reduce tasks=4673
Total megabyte-seconds taken by all map tasks=17131520
Total megabyte-seconds taken by all reduce tasks=4785152