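Is there a way in Hive to count, for each row, how many columns hold a specific value? I have data that looks like "input", and I want to count how many columns have the value 'a' and how many have the value 'b', to get the result shown as "output". Can this be done with a Hive query?
Answer 0 (score: 2)
One way to do this in Hive is: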
select ( (case when cl_1 = 'a' then 1 else 0 end) +
(case when cl_2 = 'a' then 1 else 0 end) +
(case when cl_3 = 'a' then 1 else 0 end) +
(case when cl_4 = 'a' then 1 else 0 end) +
(case when cl_5 = 'a' then 1 else 0 end)
) as count_a,
( (case when cl_1 = 'b' then 1 else 0 end) +
(case when cl_2 = 'b' then 1 else 0 end) +
(case when cl_3 = 'b' then 1 else 0 end) +
(case when cl_4 = 'b' then 1 else 0 end) +
(case when cl_5 = 'b' then 1 else 0 end)
) as count_b
from t;
To get the total count, I suggest wrapping this in a subquery and adding count_a and count_b.
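The subquery suggestion could be sketched roughly as follows. This assumes the same table t and columns cl_1 to cl_5 as the query above; the output name count_total and the subquery alias s are illustrative choices, not part of the original answer.

```sql
-- Sketch only: reuses table t and columns cl_1..cl_5 from the query above.
-- count_total and the alias s are hypothetical names chosen for illustration.
select s.count_a,
       s.count_b,
       s.count_a + s.count_b as count_total
from (
    select ( (case when cl_1 = 'a' then 1 else 0 end) +
             (case when cl_2 = 'a' then 1 else 0 end) +
             (case when cl_3 = 'a' then 1 else 0 end) +
             (case when cl_4 = 'a' then 1 else 0 end) +
             (case when cl_5 = 'a' then 1 else 0 end)
           ) as count_a,
           ( (case when cl_1 = 'b' then 1 else 0 end) +
             (case when cl_2 = 'b' then 1 else 0 end) +
             (case when cl_3 = 'b' then 1 else 0 end) +
             (case when cl_4 = 'b' then 1 else 0 end) +
             (case when cl_5 = 'b' then 1 else 0 end)
           ) as count_b
    from t
) s;  -- Hive requires an alias on a subquery in the FROM clause
```

Because every case expression falls back to 0, a NULL column simply contributes nothing to either count.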
Answer 1 (score: 2)
Use lateral view together with explode, then aggregate over the result.
select id
,sum(cast(col='a' as int)) as cnt_a
,sum(cast(col='b' as int)) as cnt_b
,sum(cast(col in ('a','b') as int)) as cnt_total
from tbl
lateral view explode(array(ci_1,ci_2,ci_3,ci_4,ci_5)) e as col
group by id