Querying the top trending items in BigQuery?

Time: 2019-02-25 20:10:02

Tags: google-bigquery

I have daily revenue data for multiple locations. Here is a sample:

+---------+------------+----------+------------+
|    ID   |  location  |  value   | timestamp  |
+---------+------------+----------+------------+
| 1       | LA         |  15.000  | 2019-02-12 |
| 2       | SF         |  23.000  | 2019-02-10 |
| 3       | NYC        |  9.000   | 2019-02-10 |
| 4       | LA         |  2.500   | 2019-02-09 |
+---------+------------+----------+------------+

I want to find the top three trending locations. The output should look like this:

+----------+------------+----------+----------------+
|   rank   |  location  |  growth  | growth_percent |
+----------+------------+----------+----------------+
| 1        | SF         |  23.000  | 0.75           |
| 2        | LA         |  17.500  | 0.62           |
| 3        | NYC        |  9.000   | 0.43           |
+----------+------------+----------+----------------+

I think the RANK() function could solve this. I started with:

SELECT location, 
  RANK() OVER (PARTITION BY location ORDER BY timestamp) as rank
FROM `revenues`
GROUP BY location, timestamp

But this returns each location multiple times. Any ideas on how to build this kind of trending query?

1 Answer:

Answer 0: (score: 1)

Try this:

-- Inline copy of the sample rows from the question
WITH `data` AS (
  SELECT 1 AS ID, 'LA' AS location, 15000 AS value, '2019-02-12' AS timestamp UNION ALL
  SELECT 2 AS ID, 'SF' AS location, 23000 AS value, '2019-02-10' AS timestamp UNION ALL
  SELECT 3 AS ID, 'NYC' AS location, 9000 AS value, '2019-02-10' AS timestamp UNION ALL
  SELECT 4 AS ID, 'LA' AS location, 2500 AS value, '2019-02-09' AS timestamp
)

SELECT
  -- No PARTITION BY: locations are ranked against each other by total value
  RANK() OVER (ORDER BY SUM(value) DESC) AS rank,
  location,
  SUM(value) AS growth
FROM `data`
-- One output row per location
GROUP BY
  location

This results in:

[
  {
    "rank": "1",
    "location": "SF",
    "growth": "23000"
  },
  {
    "rank": "2",
    "location": "LA",
    "growth": "17500"
  },
  {
    "rank": "3",
    "location": "NYC",
    "growth": "9000"
  }
]

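Grouping by location alone collapses each location into a single row, which removes the duplicates you saw in your query. Your version returned one row per (location, timestamp) pair because it grouped by both columns, and PARTITION BY location restarted the ranking inside each location instead of ranking the locations against each other.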
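The query above does not yet cut the result to three rows or produce the growth_percent column from the desired output. A minimal sketch of one way to extend it follows. The question never says how growth_percent is calculated, so the `share_of_total` column here is only an assumed stand-in (each location's share of the summed value) and will not reproduce the 0.75 / 0.62 / 0.43 figures in the question. The CTE names `totals` and `ranked` are likewise made up for illustration.

```sql
WITH `data` AS (
  SELECT 1 AS ID, 'LA' AS location, 15000 AS value, '2019-02-12' AS timestamp UNION ALL
  SELECT 2 AS ID, 'SF' AS location, 23000 AS value, '2019-02-10' AS timestamp UNION ALL
  SELECT 3 AS ID, 'NYC' AS location, 9000 AS value, '2019-02-10' AS timestamp UNION ALL
  SELECT 4 AS ID, 'LA' AS location, 2500 AS value, '2019-02-09' AS timestamp
),
-- One row per location with its summed value
totals AS (
  SELECT
    location,
    SUM(value) AS growth
  FROM `data`
  GROUP BY location
),
-- Rank the locations against each other
ranked AS (
  SELECT
    RANK() OVER (ORDER BY growth DESC) AS rank,
    location,
    growth,
    -- ASSUMPTION: share of the overall total, not the asker's growth_percent
    ROUND(growth / SUM(growth) OVER (), 2) AS share_of_total
  FROM totals
)
SELECT rank, location, growth, share_of_total
FROM ranked
WHERE rank <= 3   -- keep the top three; ties can return more than three rows
ORDER BY rank
```

Filtering on the rank in an outer query keeps tied locations together, whereas ORDER BY ... LIMIT 3 would always cut off at exactly three rows. Swap the `share_of_total` expression for whatever definition of growth percentage actually applies to your data.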