Using an aggregated column in a Hive query

Time: 2014-10-08 10:30:55

Tags: hadoop hive

My Hive table (tab1) has this structure:

people_id,time_spent,group_type
1,234,a
2,540,b
1,332,a
2,112,b

Below is the query I am trying to run, but it fails with the error "Not yet supported place for UDAF 'sum'":

select people_id,
    sum(case when group_type='a' then time_spent else 0 end) as a_time,
    sum(pow(a_time,2)) as s_sq_a_time,
    sum(case when group_type='b' then time_spent else 0 end) as b_time,
    sum(pow(b_time,2)) as s_sq_b_time
from tab1 group by people_id;

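Is it possible to reference an aggregated column from within the same select statement in Hive? I also looked at the link below, but it did not work: http://grokbase.com/t/hive/user/095tpdkrgz/built-in-aggregate-function-standard-deviation#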

2 answers:

Answer 0 (score: 1)

Give the table an alias, and use that alias when accessing its columns.

E.g.:

select startstation, count(tripid) as a
from 201508_trip_data as t
group by t.startstation

Note that 't' is the table alias, and I access the column as t.startstation.

Answer 1 (score: 0)

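You have to use a derived table (a subquery in the from clause) to reference a_time and b_time. Column aliases defined in a select list are not visible to other expressions in that same select list, but they are visible to the outer query: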

select a_time, b_time,
pow(a_time,2) as s_sq_a_time,
pow(b_time,2) as s_sq_b_time
from (
    select people_id,
    sum(case when group_type='a' then time_spent else 0 end) as a_time,
    sum(case when group_type='b' then time_spent else 0 end) as b_time 
    from tab1 group by people_id
) t1
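
Note that the query above returns the square of each person's total. If s_sq_a_time was instead meant to be the sum of the squared individual time_spent values (which is what a standard-deviation calculation would need), one option is to square inside the case expression, so that no alias is referenced at all. This is only a sketch under that assumption about the asker's intent; it uses nothing beyond the columns of tab1 shown in the question:

```sql
-- Assumption: s_sq_* means the sum of squares of each row's time_spent,
-- not the square of the per-person total.
select people_id,
       sum(case when group_type = 'a' then time_spent else 0 end) as a_time,
       sum(case when group_type = 'a' then pow(time_spent, 2) else 0 end) as s_sq_a_time,
       sum(case when group_type = 'b' then time_spent else 0 end) as b_time,
       sum(case when group_type = 'b' then pow(time_spent, 2) else 0 end) as s_sq_b_time
from tab1
group by people_id;
```

With the four sample rows, people_id 1 has a_time = 234 + 332 = 566. This version gives s_sq_a_time = 234^2 + 332^2 = 164980, whereas the derived-table query gives pow(566, 2) = 320356, so the two are not interchangeable.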