How to select the best row within a partition when using GROUP BY

Date: 2015-04-17 13:38:46

Tags: google-bigquery

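In BigQuery, I would like to select a column that is not in the GROUP BY column list, by applying a condition on another column.

Say I have the following columns: group, id, datecreated

and the following query: select group, max(datecreated) from table group by group

I would like the query to also return the id of the row that has the max(datecreated).

As far as I understand, an aggregate function applies to only one column. One idea is to concatenate datecreated and id, take the MAX() of that, and then extract the id with a regular expression. I feel there should be a simpler solution.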

1 answer:

Answer 0: (score: 4)

You can use window functions: partition by the group, order each partition by time in descending order, and pick the first row.

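-- BigQuery legacy SQL: the commas between the three inner subqueries act as
-- UNION ALL, so the inner FROM builds a three-row sample table.
-- Here g is the group and t the time; v stands in for the column to return.
SELECT *
FROM
  (SELECT g,
          v,
          row_number() over (partition BY g
                             ORDER BY t DESC) AS POSITION
   FROM
     (SELECT 1 AS g,
             1 AS t,
             10 AS v),
     (SELECT 1 AS g,
             2 AS t,
             20 AS v),
     (SELECT 1 AS g,
             3 AS t,
             15 AS v))
-- Keep only the row with the latest t in each group.
WHERE POSITION=1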

For this small dataset, the query returns:

+---+----+----------+
| g | v  | position |
+---+----+----------+
| 1 | 15 |        1 |
+---+----+----------+
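
Applied to the columns from the question, the same pattern might look like the sketch below. It is written in BigQuery standard SQL rather than the legacy dialect used above. The table name `mydataset.mytable` and the alias `rn` are placeholders that do not come from the question, and `group` is quoted with backticks because it is a reserved word.

```sql
-- Sketch only: `mydataset.mytable` is a hypothetical table name.
-- Returns, for each group, the id of the row with the latest datecreated.
SELECT `group`, id, datecreated
FROM (
  SELECT
    `group`,
    id,
    datecreated,
    -- Number the rows of each group, newest datecreated first.
    ROW_NUMBER() OVER (PARTITION BY `group` ORDER BY datecreated DESC) AS rn
  FROM `mydataset.mytable`
)
WHERE rn = 1
```

If several rows in a group share the same maximum datecreated, ROW_NUMBER() keeps only one of them, and which one is not defined unless a tie-breaking column (for example id) is added to the ORDER BY.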