Hive SUM OVER query

Time: 2016-11-26 19:47:35

Tags: partitioning hiveql olap hortonworks-data-platform

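I'm trying to convert a query from Teradata to HiveQL (HDP) and am struggling to find examples. The Teradata query below is my functional end goal: I want the total number of records in the table, then the count for each growth_type_id value, and finally each group's share of the total as a percentage.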

select  trim(growth_type_id)      AS VAL, COUNT(1) AS cnt, SUM(cnt) over () as GRP_CNT,CNT/(GRP_CNT* 1.0000) AS perc 
from acdw_apex_account_strategy
 qualify perc > .01 group by val 

Note: running HDP-2.4.3.0-227

1 answer:

Answer 0: (score: 0)

select      val
           ,cnt
           ,grp_cnt
           ,cnt/(grp_cnt* 1.0000) as perc 

from       (select      trim(growth_type_id)    as val
                       ,count(*)                as cnt
                       ,sum(count(*)) over ()   as grp_cnt

            from        acdw_apex_account_strategy 

            group by    trim(growth_type_id)
            ) t

where       cnt/grp_cnt > 0.01 
;
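
If nesting `sum(count(*)) over ()` in a single SELECT is hard to read, or is not accepted on your Hive version, the same logic can probably be split into two steps with common table expressions. This is only an untested sketch; the CTE names `grouped` and `with_total` are made up for illustration, and the table and column names are taken from the question.

```sql
-- Sketch only: aggregate first, then apply the window function to the result.
with grouped as (
    -- Step 1: one row per trimmed growth_type_id with its row count
    select      trim(growth_type_id)    as val
               ,count(*)                as cnt
    from        acdw_apex_account_strategy
    group by    trim(growth_type_id)
),
with_total as (
    -- Step 2: an empty OVER () attaches the grand total to every group row
    select      val
               ,cnt
               ,sum(cnt) over ()        as grp_cnt
    from        grouped
)
-- Step 3: keep only the groups that hold more than 1% of all rows
select      val
           ,cnt
           ,grp_cnt
           ,cnt / grp_cnt               as perc
from        with_total
where       cnt / grp_cnt > 0.01
;
```

Hive's `/` operator returns a double even for integer operands, so the `* 1.0000` from the Teradata version should not be needed here. The filter has to live in an outer WHERE because a window function result cannot be filtered in the same query block that computes it, which is the job QUALIFY does in Teradata.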

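P.S.
You are using QUALIFY before GROUP BY. Teradata syntax is flexible, and its only requirement is that the SELECT / WITH clause comes first, but I strongly recommend keeping the standard order of clauses: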
WITH - SELECT - FROM - WHERE - GROUP BY - HAVING - ORDER BY
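
For illustration, here is how the Teradata query from the question might look with its clauses in that order. This is Teradata SQL, not HiveQL, and an untested sketch: only the clause order has changed and the expressions are the ones from the question. QUALIFY is not in the list above; in Teradata it is conventionally written after GROUP BY / HAVING and before ORDER BY.

```sql
-- Teradata sketch (not HiveQL): same expressions as the question, clauses reordered
select      trim(growth_type_id)        as val
           ,count(1)                    as cnt
           ,sum(cnt) over ()            as grp_cnt
           ,cnt / (grp_cnt * 1.0000)    as perc

from        acdw_apex_account_strategy

group by    val

qualify     perc > .01
;
```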