Error: SELECT list expression references column created_utc which is neither grouped nor aggregated at [2:49]

Asked: 2018-03-01 01:59:11

Tags: google-bigquery

I have a table: t

My goal: for each week number, return only the id that has the highest score in the table.

Query:

SELECT id, 
       CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number, 
       MAX(score) AS highest_score
FROM t 
WHERE body='r/twinpeaks'
GROUP BY id;

I get this error: Error: SELECT list expression references column created_utc which is neither grouped nor aggregated at [2:49]

I tried this:

SELECT id, 
       CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number, 
       MAX(score) AS highest_score
FROM  t 
WHERE body='r/twinpeaks'
GROUP BY week_number, id;

But this is what I get:

Row  id            week_number  highest_score    
1    dmkb6sv      36            1    
2    dn1cd2s      37            2    
3    dn43h1k      38            16   
4    dn3xf18      38            1    
5    dn7i1ko      38            1
6    dnpr9b1      39            1

I want this:

Row  id            week_number  highest_score    
1    dmkb6sv      36            1    
2    dn1cd2s      37            2    
3    dn43h1k      38            16   
6    dnpr9b1      39            1

2 Answers:

Answer 0 (score: 3)

You can try using ROW_NUMBER here:

SELECT *
FROM
(
    SELECT id,
        CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
        score,
        ROW_NUMBER() OVER (PARTITION BY
            CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING)
            ORDER BY score DESC) rn
    FROM t 
    WHERE body = 'r/twinpeaks'
) t
WHERE rn = 1;

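This returns the single record with the highest score for each week number. I am assuming here that either you don't care about ties for first place or that ties do not occur. If you need to handle ties, you can use a rank function instead of ROW_NUMBER.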
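For the tie-handling case, a minimal sketch of the same query with RANK might look like the following. It assumes the same table t and columns as the question. The alias rnk and the extra inner subquery (used so the week expression is written only once) are illustrative choices, not part of the original answer. With RANK, every row that shares the top score in a week gets rank 1, so a week with a tie returns more than one row.

```sql
#standardSQL
SELECT id, week_number, score AS highest_score
FROM (
  SELECT
    id,
    week_number,
    score,
    -- RANK assigns 1 to every row tied for the top score in a week
    RANK() OVER (PARTITION BY week_number ORDER BY score DESC) AS rnk
  FROM (
    SELECT
      id,
      score,
      -- kept as an integer here (no CAST to STRING) so ORDER BY sorts numerically
      EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number
    FROM t
    WHERE body = 'r/twinpeaks'
  )
)
WHERE rnk = 1
ORDER BY week_number;
```

If exactly one row per week is required even when scores tie, ROW_NUMBER as shown above remains the simpler choice, optionally with a second ORDER BY key such as id to make the pick deterministic.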

Answer 1 (score: 3)

Below is for BigQuery Standard SQL:

     
#standardSQL
SELECT 
  EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
  ARRAY_AGG(id ORDER BY score DESC LIMIT 1)[OFFSET(0)] id,
  ARRAY_AGG(score ORDER BY score DESC LIMIT 1)[OFFSET(0)] highest_score    
FROM `project.dataset.table` 
WHERE body = 'r/twinpeaks'
GROUP BY week_number
ORDER BY week_number