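I am new to Hadoop MR. I have a 4-node cluster with 32 map slots and 16 reduce slots. The job processes close to 100 GB of data using 761 map tasks and 2 reducers.
My question is why it uses only 2 reducers. Please let me know if I have missed any reducer-related configuration, or if my expectation is wrong.
I set the following property in the MapReduce configuration, but the job still used 2 reducers.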
The default number of reduce tasks per job:
mapred.reduce.tasks = 8
Log:
15/12/30 14:58:56 INFO mapred.JobClient: Job complete: job_201512301313_0002
15/12/30 14:58:56 INFO mapred.JobClient: Counters: 33
15/12/30 14:58:56 INFO mapred.JobClient: File System Counters
15/12/30 14:58:56 INFO mapred.JobClient: FILE: Number of bytes read=11711801793
15/12/30 14:58:56 INFO mapred.JobClient: FILE: Number of bytes written=24324166884
15/12/30 14:58:56 INFO mapred.JobClient: FILE: Number of read operations=0
15/12/30 14:58:56 INFO mapred.JobClient: FILE: Number of large read operations=0
15/12/30 14:58:56 INFO mapred.JobClient: FILE: Number of write operations=0
15/12/30 14:58:56 INFO mapred.JobClient: HDFS: Number of bytes read=101855418108
15/12/30 14:58:56 INFO mapred.JobClient: HDFS: Number of bytes written=821001518
15/12/30 14:58:56 INFO mapred.JobClient: HDFS: Number of read operations=1536
15/12/30 14:58:56 INFO mapred.JobClient: HDFS: Number of large read operations=0
15/12/30 14:58:56 INFO mapred.JobClient: HDFS: Number of write operations=2
15/12/30 14:58:56 INFO mapred.JobClient: Job Counters
15/12/30 14:58:56 INFO mapred.JobClient: Launched map tasks=761
15/12/30 14:58:56 INFO mapred.JobClient: Launched reduce tasks=2
15/12/30 14:58:56 INFO mapred.JobClient: Data-local map tasks=753
15/12/30 14:58:56 INFO mapred.JobClient: Rack-local map tasks=8
15/12/30 14:58:56 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=10467348
15/12/30 14:58:56 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=936182
15/12/30 14:58:56 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/12/30 14:58:56 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
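As a sanity check on what I am seeing, here is a rough sketch that works out a few averages from the counters in the log above. The variable names are my own, and it assumes the HDFS read and write totals are spread evenly across tasks, which is only approximately true. The value 8 is the mapred.reduce.tasks setting I expected to take effect.

```python
# Counter values copied from the job log above.
hdfs_bytes_read = 101_855_418_108     # HDFS: Number of bytes read
hdfs_bytes_written = 821_001_518      # HDFS: Number of bytes written
launched_maps = 761                   # Launched map tasks
launched_reduces = 2                  # Launched reduce tasks
configured_reduces = 8                # mapred.reduce.tasks value I set (expected, not observed)

MIB = 1024 * 1024

# Average input per map task, assuming input is spread evenly across maps.
avg_input_per_map_mib = hdfs_bytes_read / launched_maps / MIB

# Average output per reducer, assuming output is spread evenly across reducers.
avg_output_per_reduce_mib = hdfs_bytes_written / launched_reduces / MIB

print(f"average input per map:      {avg_input_per_map_mib:.1f} MiB")
print(f"average output per reducer: {avg_output_per_reduce_mib:.1f} MiB")
print(f"reducers launched vs configured: {launched_reduces} vs {configured_reduces}")
```

The map side looks as I would expect for this input size. The reduce side is where the numbers disagree with my setting: 2 reduce tasks were launched and there were 2 HDFS write operations, not 8.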