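My problem statement is:
"Find the top 2 districts by population in each state."
The data looks like this:
My expected output is:
I tried many queries and subqueries, but the subqueries resulted in SQL errors.
Can someone help me get this result?
Thanks.
The query I tried:
population grouped by state name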
Answer 0: (score: 0)
Here is the query:
select A.state, collect_set(A.dist)[0], collect_set(A.dist)[1] from
(select state, dist, row_number() over (partition by state order by population
desc) as rnk from <tableName>) A
where A.rnk<=2 group by A.state;
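One caveat with the query above: collect_set removes duplicates and does not promise any element order, so index [0] is not guaranteed to be the most populous district. A minimal sketch of an alternative that picks each rank explicitly with conditional aggregation is shown here. The table name districts and the output column names top_dist_1 and top_dist_2 are assumptions for illustration; state, dist and population are taken from the query above.

```sql
-- Hypothetical table: districts(state STRING, dist STRING, population BIGINT)
SELECT ranked.state,
       -- pick the district at each rank explicitly instead of relying on array order
       MAX(CASE WHEN ranked.rnk = 1 THEN ranked.dist END) AS top_dist_1,
       MAX(CASE WHEN ranked.rnk = 2 THEN ranked.dist END) AS top_dist_2
FROM (
  SELECT state,
         dist,
         -- rank districts within each state, largest population first
         ROW_NUMBER() OVER (PARTITION BY state ORDER BY population DESC) AS rnk
  FROM districts
) ranked
WHERE ranked.rnk <= 2
GROUP BY ranked.state;
```

If a state has only one district, top_dist_2 comes back as NULL. Dropping the outer GROUP BY and selecting state, dist, rnk directly would instead return one row per district, which may be easier to consume if the result does not need to be pivoted into columns.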
Here is the result on sample data (a sample table named hier, showing how the collect_set indexing behaves):
hive> select * from hier;
OK
C1 C11
C11 C12
C12 123
P1 C1
P2 C2
hive> select parent, collect_set(child)[0], collect_set(child)[1] from hier group by parent;
OK
C1 C11 NULL
C11 C12 NULL
C12 123 NULL
P1 C1 NULL
P2 C2 NULL
Time taken: 19.212 seconds, Fetched: 5 row(s)