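I'm trying to convert a query from Teradata to Hive QL (HDF) and am struggling to find examples. Teradata (my functional end goal): I want the total number of records in the table, then the count for each growth_type_id value, and finally each group's percentage of the total.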
select trim(growth_type_id) AS VAL, COUNT(1) AS cnt, SUM(cnt) over () as GRP_CNT,CNT/(GRP_CNT* 1.0000) AS perc
from acdw_apex_account_strategy
qualify perc > .01 group by val
Note: running HDP-2.4.3.0-227
Answer 0 (score: 0)
select val
,cnt
,grp_cnt
,cnt/(grp_cnt* 1.0000) as perc
from (select trim(growth_type_id) as val
,count(*) as cnt
,sum(count(*)) over () as grp_cnt
from acdw_apex_account_strategy
group by trim(growth_type_id)
) t
where cnt/grp_cnt > 0.01
;
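The subquery is needed because, as far as I know, the Hive version shipped with HDP 2.4 has no QUALIFY clause, so the window result has to be filtered one level up. As a minimal sketch under that assumption, the same logic can also be written with a common table expression (supported since Hive 0.13); the CTE name t is simply carried over from the query above, and the filter repeats the perc expression so that the displayed and filtered values match:

```sql
-- Sketch only: same logic as the query above, with the inner query moved into a CTE.
with t as
(
    select  trim(growth_type_id)    as val
           ,count(*)                as cnt
           ,sum(count(*)) over ()   as grp_cnt   -- total row count across all groups
    from    acdw_apex_account_strategy
    group by trim(growth_type_id)
)
select  val
       ,cnt
       ,grp_cnt
       ,cnt/(grp_cnt* 1.0000)       as perc
from    t
where   cnt/(grp_cnt* 1.0000) > 0.01            -- keep groups above 1% of the table
;
```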
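P.S.
You used QUALIFY before GROUP BY. Teradata syntax is flexible here, and its only requirement is that the SELECT / WITH clause comes first, but I strongly recommend keeping the clauses in their standard order: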
WITH - SELECT - FROM - WHERE - GROUP BY - HAVING - ORDER BY
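For illustration, here is the original Teradata query with its clauses rearranged into that order. This is a sketch, not a tested statement: QUALIFY does not appear in the list above, and placing it after GROUP BY / HAVING and before ORDER BY is an assumption based on common Teradata convention. The reuse of the aliases val, cnt, grp_cnt and perc is kept exactly as in the question.

```sql
-- Teradata only (Hive on HDP 2.4 would need the subquery form shown in the answer).
select   trim(growth_type_id)       as val
        ,count(1)                   as cnt
        ,sum(cnt) over ()           as grp_cnt
        ,cnt/(grp_cnt* 1.0000)      as perc
from     acdw_apex_account_strategy
group by val
qualify  perc > .01                 -- assumed position: after GROUP BY / HAVING
;
```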