Why does a Hive query with no order by / sort by clause end up with only one reducer?

Time: 2013-08-27 13:53:24

Tags: hadoop mapreduce hive reducers

I have a simple query that runs a streaming job (a transform script), and it contains no order by clause.

set hive.exec.max.dynamic.partitions.pernode=100;
set hive.exec.max.dynamic.partitions=100;
set hive.exec.max.created.files=100;
set hive.exec.dynamic.partition.mode=nonstrict;
set mapred.reduce.tasks=20;
add file /home/devo/c1166313/pafvalid.py ;
add file /home/devo/c1166313/paf-rules.properties ;
from (
  from (select * from mz_paf_errors_dummy_v) p
  select transform (p.*)
    row format delimited fields terminated by '|'
    using 'pafvalid.py paf-rules.properties 10'
    as (<column list>)
    row format delimited fields terminated by '|'
) b
insert overwrite table mytab partition (passfail, batch_sk)
select <col list>;

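This is a medium-sized cluster (a few dozen machines), and the job runs more than two thousand map tasks. Why does it end up with a single reducer, even though mapred.reduce.tasks is set to 20?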

Number of reduce tasks determined at compile time: 1

0 answers:

No answers yet.