I have a table: t
My goal: extract only the "id" that has the highest score in the table, grouped by week number.
Query:
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
MAX(score) AS highest_score
FROM t
WHERE body='r/twinpeaks'
GROUP BY id;
I get this error: Error: SELECT list expression references column created_utc which is neither grouped nor aggregated at [2:49]
I then tried this:
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
MAX(score) AS highest_score
FROM t
WHERE body='r/twinpeaks'
GROUP BY week_number, id;
But this is what I get:
Row id week_number highest_score
1 dmkb6sv 36 1
2 dn1cd2s 37 2
3 dn43h1k 38 16
4 dn3xf18 38 1
5 dn7i1ko 38 1
6 dnpr9b1 39 1
What I want is this:
Row id week_number highest_score
1 dmkb6sv 36 1
2 dn1cd2s 37 2
3 dn43h1k 38 16
6 dnpr9b1 39 1
Answer 0 (score: 3)
You can try using ROW_NUMBER here:
SELECT *
FROM
(
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
score,
ROW_NUMBER() OVER (PARTITION BY
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING)
ORDER BY score DESC) rn
FROM t
WHERE body = 'r/twinpeaks'
) t
WHERE rn = 1;
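This returns the highest-scoring record for each week number. I am assuming here that either ties for first place do not occur or you do not care which tied record is returned. If you need to handle ties, you can use a rank function instead of ROW_NUMBER.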
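To illustrate the tie-handling remark, here is a minimal sketch that swaps ROW_NUMBER for RANK. The sample_rows CTE and its ids (a1 to a4), timestamps, and scores are hypothetical, made up only so the query can run on its own; in practice you would select from t instead.

#standardSQL
-- Hypothetical sample rows (not from the question); a1 and a2 tie in the same week.
WITH sample_rows AS (
  SELECT 'a1' AS id, UNIX_SECONDS(TIMESTAMP '2017-09-18 00:00:00') AS created_utc,
         16 AS score, 'r/twinpeaks' AS body UNION ALL
  SELECT 'a2', UNIX_SECONDS(TIMESTAMP '2017-09-19 00:00:00'), 16, 'r/twinpeaks' UNION ALL
  SELECT 'a3', UNIX_SECONDS(TIMESTAMP '2017-09-20 00:00:00'), 1, 'r/twinpeaks' UNION ALL
  SELECT 'a4', UNIX_SECONDS(TIMESTAMP '2017-09-25 00:00:00'), 2, 'r/twinpeaks'
)
SELECT id, week_number, score AS highest_score
FROM
(
  SELECT id,
         EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
         score,
         -- RANK gives every row tied for the top score the value 1,
         -- whereas ROW_NUMBER would give 1 to only one of them.
         RANK() OVER (PARTITION BY EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc))
                      ORDER BY score DESC) AS rnk
  FROM sample_rows
  WHERE body = 'r/twinpeaks'
)
WHERE rnk = 1
ORDER BY week_number, id;

With this sample data, a1 and a2 are both returned for the week they share, and a4 is returned for the following week. If you want exactly one row per week even when scores tie, keep ROW_NUMBER and add a tie-breaker to the ORDER BY (for example, ORDER BY score DESC, id).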
Answer 1 (score: 3)
The following is for BigQuery Standard SQL:
#standardSQL
SELECT
EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
ARRAY_AGG(id ORDER BY score DESC LIMIT 1)[OFFSET(0)] id,
ARRAY_AGG(score ORDER BY score DESC LIMIT 1)[OFFSET(0)] highest_score
FROM `project.dataset.table`
WHERE body = 'r/twinpeaks'
GROUP BY week_number
ORDER BY week_number
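
As a possible variation on the query above, the id and score can be packed into a single STRUCT so that only one ARRAY_AGG is needed and both values are guaranteed to come from the same row. This is a sketch, not part of the original answer; the sample_rows CTE and its values are hypothetical and stand in for the real table.

#standardSQL
-- Hypothetical sample rows (not from the question), used only to make the query runnable.
WITH sample_rows AS (
  SELECT 'a1' AS id, UNIX_SECONDS(TIMESTAMP '2017-09-18 00:00:00') AS created_utc,
         16 AS score, 'r/twinpeaks' AS body UNION ALL
  SELECT 'a2', UNIX_SECONDS(TIMESTAMP '2017-09-20 00:00:00'), 1, 'r/twinpeaks' UNION ALL
  SELECT 'a3', UNIX_SECONDS(TIMESTAMP '2017-09-25 00:00:00'), 2, 'r/twinpeaks'
)
SELECT week_number, best.id AS id, best.score AS highest_score
FROM
(
  SELECT
    EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
    -- Keep only the top-scoring (id, score) pair of each week.
    ARRAY_AGG(STRUCT(id, score) ORDER BY score DESC LIMIT 1)[OFFSET(0)] AS best
  FROM sample_rows
  WHERE body = 'r/twinpeaks'
  GROUP BY week_number
)
ORDER BY week_number;

Like the ROW_NUMBER approach, this returns exactly one row per week, so when scores tie, which row wins is arbitrary unless the ORDER BY inside ARRAY_AGG includes a tie-breaker.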