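I'd like to be able to do some calculations over partitions in BigQuery, and then output only one row per partition (rather than one row for every row in the partition). For example, if I have a table like this: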
Category | Location | Count
A | 'home' | 20
A | 'work' | 10
A | 'lab' | 6
B | 'home' | 5
C | 'lab' | 15
C | 'home' | 25
I'd like to end up with this result:
Category | TopLocation | TopCount | SecondLocation | SecondCount
A | 'home' | 20 | 'work' | 10
B | 'home' | 5 | NULL | NULL
C | 'home' | 25 | 'lab' | 15
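I think I can do this with partitioning, but that ends up producing a row for every value rather than the single row I want, so I then group by category and use FIRST. Is there a better way that avoids generating so many intermediate rows (and hopefully avoids the "large results" problem with window functions)?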
SELECT
  category,
  FIRST(TopLocation) TopLocation,
  FIRST(TopCount) TopCount,
  FIRST(SecondLocation) SecondLocation,
  FIRST(SecondCount) SecondCount
FROM
  (SELECT
    category,
    -- Order by count DESC so that position 1 is the largest count.
    NTH_VALUE(Location, 1) OVER (PARTITION BY category ORDER BY count DESC) TopLocation,
    NTH_VALUE(Count, 1) OVER (PARTITION BY category ORDER BY count DESC) TopCount,
    NTH_VALUE(Location, 2) OVER (PARTITION BY category ORDER BY count DESC) SecondLocation,
    NTH_VALUE(Count, 2) OVER (PARTITION BY category ORDER BY count DESC) SecondCount
  FROM
    mytable
  )
GROUP BY
  category
ORDER BY
  category DESC
Answer 0 (score: 0)
Update: how about a better solution using #standardSQL:
SELECT word, word_count, corpus, rank FROM (
  SELECT word, word_count, corpus,
    RANK() OVER (PARTITION BY corpus ORDER BY word_count DESC) rank
  FROM `bigquery-public-data.samples.shakespeare`
  WHERE word_count > 6
)
WHERE rank < 3
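The query above returns the top rows per partition but still one row per word. To collapse this into the single row per category that the question asks for, one option in standard SQL is ARRAY_AGG with ORDER BY and LIMIT, which keeps only the top two entries per group during aggregation. This is a minimal sketch, not a tuned solution: it assumes the question's sample rows inlined as a `mytable` CTE, and `top2` is just an illustrative alias.

```sql
#standardSQL
WITH mytable AS (
  -- Sample rows copied from the question (assumed schema).
  SELECT 'A' AS category, 'home' AS location, 20 AS count UNION ALL
  SELECT 'A', 'work', 10 UNION ALL
  SELECT 'A', 'lab', 6 UNION ALL
  SELECT 'B', 'home', 5 UNION ALL
  SELECT 'C', 'lab', 15 UNION ALL
  SELECT 'C', 'home', 25
)
SELECT
  category,
  -- SAFE_OFFSET yields NULL when a category has fewer than two rows.
  top2[SAFE_OFFSET(0)].location AS TopLocation,
  top2[SAFE_OFFSET(0)].count AS TopCount,
  top2[SAFE_OFFSET(1)].location AS SecondLocation,
  top2[SAFE_OFFSET(1)].count AS SecondCount
FROM (
  SELECT
    category,
    -- Keep only the two largest counts per category.
    ARRAY_AGG(STRUCT(location, count) ORDER BY count DESC LIMIT 2) AS top2
  FROM mytable
  GROUP BY category
)
ORDER BY category
```

With the sample data this should give one row per category, with NULLs in the second pair of columns for category B.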
Answer 1 (score: 0)
This should do the job:
select category,
  first(if(rank = 1, location, null)) as location_1, first(if(rank = 1, count, null)) as count_1,
  first(if(rank = 2, location, null)) as location_2, first(if(rank = 2, count, null)) as count_2,
  first(if(rank = 3, location, null)) as location_3, first(if(rank = 3, count, null)) as count_3
from
  (select row_number() over (partition by category order by count desc) as rank, *
  from
    (select 'A' as category, 'home' AS location, 20 as count),
    (select 'A' as category, 'work' AS location, 10 as count),
    (select 'A' as category, 'lab' AS location, 6 as count),
    (select 'B' as category, 'home' AS location, 5 as count),
    (select 'C' as category, 'lab' AS location, 15 as count),
    (select 'C' as category, 'home' AS location, 25 as count)
  )
group by category order by category
Result:
Row | category | location_1 | count_1 | location_2 | count_2 | location_3 | count_3
1 | A | home | 20 | work | 10 | lab | 6
3 | B | home | 5 | null | null | null | null
2 | C | home | 25 | lab | 15 | null | null
But it probably does not solve the "large query results" problem with window functions.
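For anyone running this outside legacy SQL, here is a rough standard SQL port of the same row_number plus conditional aggregation idea. FIRST() is a legacy SQL aggregate, so this sketch swaps in MAX(IF(...)), which works here because each rank value matches at most one row per category. The names `ranked` and `rn` are illustrative, and the sample rows are again inlined as an assumed `mytable` CTE. Like the legacy version, it still ranks every input row before aggregating, so it does not avoid the intermediate rows.

```sql
#standardSQL
WITH mytable AS (
  -- Sample rows copied from the question (assumed schema).
  SELECT 'A' AS category, 'home' AS location, 20 AS count UNION ALL
  SELECT 'A', 'work', 10 UNION ALL
  SELECT 'A', 'lab', 6 UNION ALL
  SELECT 'B', 'home', 5 UNION ALL
  SELECT 'C', 'lab', 15 UNION ALL
  SELECT 'C', 'home', 25
),
ranked AS (
  -- Number the rows within each category, largest count first.
  SELECT
    category, location, count,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY count DESC) AS rn
  FROM mytable
)
SELECT
  category,
  -- Each IF matches at most one row per category; MAX ignores the NULLs.
  MAX(IF(rn = 1, location, NULL)) AS location_1,
  MAX(IF(rn = 1, count, NULL)) AS count_1,
  MAX(IF(rn = 2, location, NULL)) AS location_2,
  MAX(IF(rn = 2, count, NULL)) AS count_2
FROM ranked
GROUP BY category
ORDER BY category
```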