I have an array of structs, and I am trying to find the count, sum, and distinct values of a struct column.
create table temp (regionkey smallint, name string, comment string, nations array<struct<n_nationkey:smallint,n_name:string,n_comment:string>>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
MAP KEYS TERMINATED BY ',';
When I try to run this query:
select name,
count(nations.n_nationkey) as count,
sum(nations.n_nationkey) as sum,
ndv(nations.n_nationkey) as distinct_val
from temp
group by name
order by name;
I get this error:
FAILED: UDFArgumentTypeException Only primitive type arguments are accepted but array<smallint> is passed.
What I want is the count, the sum, and the number of distinct values of n_nationkey.
Any help would be highly appreciated.
Answer 0 (score: 1)
select t.name
,count (e.col.n_nationkey) as count
,sum (e.col.n_nationkey) as sum
,count (distinct e.col.n_nationkey) as distinct_val
from temp t lateral view explode (t.nations) e
group by t.name
order by t.name
;
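nations is not a struct. It is an array of structs. It has no n_nationkey attribute itself; it has struct elements, and each of those has an n_nationkey attribute. That is why the original query fails: nations.n_nationkey evaluates to an array<smallint>, and count and sum accept only primitive types.
The explode function takes the array of structs (nations) and returns each struct (nation) in a separate row, so the aggregates then operate on primitive values.
The same solution with an alias: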
select t.name
,count (e.nation.n_nationkey) as count
,sum (e.nation.n_nationkey) as sum
,count (distinct e.nation.n_nationkey) as distinct_val
from temp t lateral view explode (t.nations) e as nation
group by t.name
order by t.name
;
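To see what explode produces before any aggregation is applied, a query along these lines may help. It is only a sketch against the temp table from the question; the output column names nationkey and nation_name are illustrative and not part of the original schema.

```sql
-- Sketch: inspect the rows produced by explode, without aggregating.
-- Each element of t.nations becomes its own row, exposed as the struct column `nation`.
-- `nationkey` and `nation_name` are illustrative aliases only.
select t.name
      ,e.nation.n_nationkey as nationkey
      ,e.nation.n_name      as nation_name
from temp t lateral view explode (t.nations) e as nation
order by t.name
;
```

Each row of temp appears once per element of its nations array, which is why grouping by t.name afterwards gives per-region counts and sums. Note that lateral view drops rows whose array is empty or null; lateral view outer keeps them, with nation set to null.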