Calculating the percentage of each category in a Hive column

Date: 2018-09-23 15:18:14

Tags: hadoop hive hdfs

I have a table named colors in Hive that looks like this:

 id cname
 1 Blue
 2 Green
 3 Green
 4 Blue
 5 Blue

I need help writing a Hive query that returns the percentage of each color in the cname column. The result should look like this:

Blue  60%
Green 40%

Thanks!

1 Answer:

Answer 0 (score: 1)

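Use analytic (window) functions. count(*) over (partition by cname) gives the number of rows for each color, and count(*) over () gives the total number of rows; the outer group by collapses the repeated rows into one per color: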

select cname, concat(pct, ' %') pct
from
(
select (
        count(*) over (partition by cname)/
        count(*) over ()
       )*100 as pct,
       cname
  from (--Replace this subquery with your table
        select stack (5,
                      1, 'Blue',
                      2, 'Green',
                      3, 'Green',
                      4, 'Blue',
                      5, 'Blue' )  as (id, cname)

        ) colors
)s
group by cname, pct;

Result:

OK
Blue    60.0 %
Green   40.0 %

Just replace the colors subquery with your own table.
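
For comparison, here is a minimal sketch of an alternative that aggregates first and then divides by the grand total. It is not part of the answer above. It assumes the colors table from the question already exists with a cname column; the alias t, the column name cnt, and the rounding to one decimal place are illustrative choices.

```sql
-- Sketch only: assumes a Hive table named colors with a cname column.
select cname,
       -- share of each color: its count divided by the sum of all counts
       concat(round(cnt * 100.0 / sum(cnt) over (), 1), ' %') as pct
  from (
        -- one row per color with its row count
        select cname, count(*) as cnt
          from colors
         group by cname
       ) t;
```

Because the grouping happens before the window function is applied, the window runs over one row per color rather than over every row of the table, which may be cheaper on large tables. For the sample data this should give about 60 % for Blue and 40 % for Green; the exact number formatting can vary between Hive versions.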