ROW_NUMBER() fails when the table is too large

Asked: 2018-12-31 21:26:37

Tags: sql google-bigquery row-number

I am using BigQuery, and I need ROW_NUMBER() to get only the first row that matches certain criteria.

Example:

SELECT * EXCEPT(rn)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY timedate DESC) AS rn
  FROM
    table
)
WHERE rn = 1

However, the query fails because the table is too large. How can I apply this logic without running out of resources?

1 answer:

Answer 0: (score: 5)

The following is for BigQuery Standard SQL:

#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

You can test and play with it using the dummy data below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 2 timedate, 3 z UNION ALL
  SELECT 1, 4, 5 UNION ALL
  SELECT 1, 6, 7 UNION ALL
  SELECT 2, 8, 9 UNION ALL
  SELECT 2, 10, 11
)
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

The result is:

Row  id  timedate  z
1    1   6         7
2    2   10        11
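
As a sanity check, the ROW_NUMBER() query from the question can be run against the same dummy data. This is only a sketch: the ORDER BY id at the end is my addition to make the output order predictable, and everything else is taken from the question and the answer. At this size the window-function form runs without trouble and should return the same two rows, (1, 6, 7) and (2, 10, 11).

```sql
#standardSQL
-- Same dummy data as in the answer.
WITH `project.dataset.table` AS (
  SELECT 1 id, 2 timedate, 3 z UNION ALL
  SELECT 1, 4, 5 UNION ALL
  SELECT 1, 6, 7 UNION ALL
  SELECT 2, 8, 9 UNION ALL
  SELECT 2, 10, 11
)
-- The question's approach: number every row within each id, keep the first.
SELECT * EXCEPT(rn)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY timedate DESC) AS rn
  FROM `project.dataset.table`
)
WHERE rn = 1
ORDER BY id  -- added only so the output order is deterministic
```

The two forms agree on the result. The difference, as far as I understand it, is that ARRAY_AGG(... ORDER BY timedate DESC LIMIT 1) only has to keep one row per id while aggregating, whereas ROW_NUMBER() has to order and number every row in each partition, which is presumably what exhausts resources on a large table. If several rows share the latest timedate for an id, either form may return any one of them.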