Say my table looks like this:
Name,Subject,Score
Jon,English,80
Amy,Geography,70
Matt,English,90
Jon,Math,100
Jon,History,60
Amy,French,90
Is there a way to use collect_list so that my query returns something like this:
Jon: English:80; Math:100; History:60
Amy: Geography:70; French:90
Matt: English:90
Edit:
The complication here is that the collect_list UDF accepts only one argument, i.e. a single column.
Something like
SELECT name, collect_list(subject), collect_list(score) from mytable group by name
results in
Jon | [English,Math,History] | [80,100,60]
Amy | [Geography,French] | [70,90]
Matt | [English] | [90]
Answer 0 (score: 3)
Not sure whether this is what you need.
select * from t0;
+-------+------------+-------+--+
| t0.a | t0.b | t0.c |
+-------+------------+-------+--+
| Jon | English | 80 |
| Amy | Geography | 70 |
| Matt | English | 90 |
| Jon | Math | 100 |
| Jon | History | 60 |
| Amy | French | 90 |
+-------+------------+-------+--+
select a, collect_list(concat_ws(':',b,cast(c as string))) from t0 group by a;
+-------+-----------------------------------------+--+
| a | _c1 |
+-------+-----------------------------------------+--+
| Amy | ["Geography:70","French:90"] |
| Jon | ["English:80","Math:100","History:60"] |
| Matt | ["English:90"] |
+-------+-----------------------------------------+--+
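If you want a single delimited string per name (as in the question) rather than an array, one possible sketch is to wrap the collect_list call in another concat_ws, since Hive's concat_ws also accepts an array of strings. This reuses the t0 table and columns a, b, c from above; the alias subject_scores is only an illustrative name.

```sql
-- Sketch only: assumes table t0(a, b, c) as shown above.
-- Inner concat_ws builds "Subject:Score" for each row;
-- collect_list gathers those strings per name;
-- outer concat_ws joins the array into one '; '-separated string.
SELECT a,
       concat_ws('; ',
                 collect_list(concat_ws(':', b, CAST(c AS STRING)))) AS subject_scores
FROM t0
GROUP BY a;
```

Note that collect_list does not guarantee the order of elements within a group, so the subjects may not come back in insertion order.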