BigQuery: ROW_NUMBER over groups of rows based on a specific field

Asked: 2015-05-05 07:10:23

Tags: google-bigquery

I have data like this:

cityName      stateId
cityText1        52
cityText2        52
cityText3        52

cityExp1         72
cityExp2         72
cityExp3         72

city1            21
city2            21

I am fetching this data with subqueries. Now, in BigQuery, I want the data to look like this:

cityName      rowNumber
cityText1        1
cityText2        1
cityText3        1

cityExp1         2
cityExp2         2
cityExp3         2

city1            3
city2            3

I tried using row_number(), but it assigns a unique number to every row, which is not what I want. How can I get this result?

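1 Answer:

Answer 0: (score: 4)

You first need to add a scalar value to partition by, and then apply ROW_NUMBER over that partition. Grouping by stateId first leaves one row per state, so each state receives exactly one number.

Update: scroll to the bottom of the answer for a suggestion that does not use the scalar.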

SELECT stateId,
       row_number() over (partition BY scalar) AS INDEX
FROM
  (SELECT stateId,
          1 AS scalar
   FROM
     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)
   GROUP BY stateId) d

This returns:

+-----+---------+-------+
| Row | stateId | index |
+-----+---------+-------+
|   1 |      52 |     1 |
|   2 |      72 |     2 |
|   3 |      21 |     3 |
+-----+---------+-------+

You can then join this result back to the table on stateId to build the final output. With our static table, this makes for a long query:

SELECT t.cityName,
       t.stateId,
       d.index
FROM
  (SELECT *
   FROM
     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)) t
JOIN
  (SELECT stateId,
          row_number() over (partition BY scalar) AS INDEX
   FROM
     (SELECT stateId,
             1 AS scalar
      FROM
        (SELECT 'cityText1' AS cityName,
                52 AS stateId),
        (SELECT 'cityText2' AS cityName,
                52 AS stateId),
        (SELECT 'cityText3' AS cityName,
                52 AS stateId),
        (SELECT 'cityExp1' AS cityName,
                72 AS stateId),
        (SELECT 'cityExp2' AS cityName,
                72 AS stateId),
        (SELECT 'cityExp3' AS cityName,
                72 AS stateId),
        (SELECT 'city1' AS cityName,
                21 AS stateId),
        (SELECT 'city2' AS cityName,
                21 AS stateId)
      GROUP BY stateId)) d ON d.stateId=t.stateId

This returns the final output:

+-----+------------+-----------+---------+
| Row | t_cityName | t_stateId | d_index |
+-----+------------+-----------+---------+
|   1 | cityText1  |        52 |       1 |
|   2 | cityText2  |        52 |       1 |
|   3 | cityText3  |        52 |       1 |
|   4 | cityExp1   |        72 |       2 |
|   5 | cityExp2   |        72 |       2 |
|   6 | cityExp3   |        72 |       2 |
|   7 | city1      |        21 |       3 |
|   8 | city2      |        21 |       3 |
+-----+------------+-----------+---------+
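
The join above repeats the static table in both branches, so the query may be easier to read with the table defined once. Below is a minimal sketch in BigQuery standard SQL, not the legacy dialect used in the rest of this answer. The names `cities`, `state_index`, and `idx` are made up for illustration, and the numbering follows ascending stateId (21, 52, 72) rather than the order in which the rows appear, because the sample data has no column that records that order.

```sql
#standardSQL
-- Illustrative sample table, defined once and reused below.
WITH cities AS (
  SELECT 'cityText1' AS cityName, 52 AS stateId UNION ALL
  SELECT 'cityText2', 52 UNION ALL
  SELECT 'cityText3', 52 UNION ALL
  SELECT 'cityExp1', 72 UNION ALL
  SELECT 'cityExp2', 72 UNION ALL
  SELECT 'cityExp3', 72 UNION ALL
  SELECT 'city1', 21 UNION ALL
  SELECT 'city2', 21
),
-- One row per distinct stateId, numbered in ascending stateId order.
state_index AS (
  SELECT stateId,
         ROW_NUMBER() OVER (ORDER BY stateId) AS idx
  FROM (SELECT DISTINCT stateId FROM cities)
)
-- Attach each state's number to every city in that state.
SELECT t.cityName,
       t.stateId,
       d.idx
FROM cities AS t
JOIN state_index AS d ON d.stateId = t.stateId
ORDER BY d.idx, t.cityName;
```

With this ordering, the rows for stateId 21 get 1, those for 52 get 2, and those for 72 get 3. To number the groups in a different order, you would need a column to sort by in the OVER clause.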

Update: without the scalar, the query becomes:

SELECT stateId,
       row_number() over () AS INDEX
FROM

     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)
GROUP BY stateId
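
As a possible alternative that avoids the join entirely, DENSE_RANK() gives the same number to all rows that share a value in its ORDER BY expression, without gaps between groups. The following is only a sketch in BigQuery standard SQL, assuming that numbering the groups by ascending stateId is acceptable; `cities` and `rowNumber` are illustrative names.

```sql
#standardSQL
-- Illustrative sample table matching the data in the question.
WITH cities AS (
  SELECT 'cityText1' AS cityName, 52 AS stateId UNION ALL
  SELECT 'cityText2', 52 UNION ALL
  SELECT 'cityText3', 52 UNION ALL
  SELECT 'cityExp1', 72 UNION ALL
  SELECT 'cityExp2', 72 UNION ALL
  SELECT 'cityExp3', 72 UNION ALL
  SELECT 'city1', 21 UNION ALL
  SELECT 'city2', 21
)
-- Rows with equal stateId tie, so they all receive the same rank.
SELECT cityName,
       stateId,
       DENSE_RANK() OVER (ORDER BY stateId) AS rowNumber
FROM cities
ORDER BY rowNumber, cityName;
```

Here every city in a state shares one rowNumber (21 gets 1, 52 gets 2, 72 gets 3), which differs from the 52, 72, 21 order in the question only because the sample data carries no column describing that order.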