BigQuery Standard SQL: count occurrences in the last 1, 7, and 30 days

Time: 2019-03-08 06:04:10

Tags: sql google-bigquery

I want a query result with one row per entity, where each column shows how many times that entity occurred in the last 1, 7, and 30 days.

I have tables like these:

document:

+-----+---------+-------------------------+
| dId | score   | datetime                |
+-----+---------+-------------------------+
| A   | 100     | 2019-03-08 16:17:34.043 |
| B   | 80      | 2019-02-15 16:17:34.043 |
| C   | 70      | 2019-03-08 16:17:34.043 |
+-----+---------+-------------------------+

entity:

+------+-----+
| name | dId |
+------+-----+
| e1   |   A |
| e2   |   A |
| e1   |   B |
| e1   |   C |
| e2   |   C |
+------+-----+

Expected output:

+------+----+----+------+
| name | 1D | 7D |  30D |
+------+----+----+------+
| e1   | 2  |  2 |   3  |
| e2   | 1  |  1 |   2  |
+------+----+----+------+

A simple query to get the records from the last 30 days is:

SELECT * FROM document WHERE datetime >= DATETIME_SUB(CURRENT_DATETIME(), INTERVAL 30 DAY)

But how can I join the two tables and get the record counts for the last 1, 7, and 30 days?

2 Answers:

Answer 0: (score: 1)

Use CASE expressions:

SELECT e.name,
       SUM(CASE WHEN d.datetime >= DATETIME_SUB(CURRENT_DATETIME(), INTERVAL 1 DAY)
                THEN 1 ELSE 0 END) AS oneD,
       SUM(CASE WHEN d.datetime >= DATETIME_SUB(CURRENT_DATETIME(), INTERVAL 7 DAY)
                THEN 1 ELSE 0 END) AS sevenD,
       SUM(CASE WHEN d.datetime >= DATETIME_SUB(CURRENT_DATETIME(), INTERVAL 30 DAY)
                THEN 1 ELSE 0 END) AS thirtyD
FROM document d
JOIN entity e ON d.did = e.did
GROUP BY e.name
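
To try this without creating any tables, the sample rows from the question can be inlined as CTEs. This is only a minimal sketch: it assumes the `datetime` column is of type DATETIME, and the `params` CTE with its fixed `now_dt` value is a hypothetical stand-in for CURRENT_DATETIME(), so the result does not depend on when the query is run.

```sql
WITH document AS (
  -- Sample rows copied from the question.
  SELECT 'A' AS dId, 100 AS score, DATETIME '2019-03-08 16:17:34.043' AS datetime UNION ALL
  SELECT 'B', 80, DATETIME '2019-02-15 16:17:34.043' UNION ALL
  SELECT 'C', 70, DATETIME '2019-03-08 16:17:34.043'
),
entity AS (
  SELECT 'e1' AS name, 'A' AS dId UNION ALL
  SELECT 'e2', 'A' UNION ALL
  SELECT 'e1', 'B' UNION ALL
  SELECT 'e1', 'C' UNION ALL
  SELECT 'e2', 'C'
),
params AS (
  -- Hypothetical fixed "now"; use CURRENT_DATETIME() against live data.
  SELECT DATETIME '2019-03-08 18:00:00' AS now_dt
)
SELECT
  e.name,
  SUM(CASE WHEN d.datetime >= DATETIME_SUB(p.now_dt, INTERVAL 1 DAY)
           THEN 1 ELSE 0 END) AS oneD,
  SUM(CASE WHEN d.datetime >= DATETIME_SUB(p.now_dt, INTERVAL 7 DAY)
           THEN 1 ELSE 0 END) AS sevenD,
  SUM(CASE WHEN d.datetime >= DATETIME_SUB(p.now_dt, INTERVAL 30 DAY)
           THEN 1 ELSE 0 END) AS thirtyD
FROM document AS d
JOIN entity AS e ON d.dId = e.dId
CROSS JOIN params AS p   -- single-row CTE, so it does not multiply rows
GROUP BY e.name
ORDER BY e.name;
```

With that reference time, the sample rows give 2, 2, 3 for e1 and 2, 2, 2 for e2, because e2 is linked to documents A and C, which are both dated 2019-03-08.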

Answer 1: (score: 0)

I would suggest using COUNTIF() in BigQuery:

SELECT e.name,
       COUNTIF(d.datetime >= DATETIME_SUB(CURRENT_DATETIME, INTERVAL 1 day)) AS day_1,
       COUNTIF(d.datetime >= DATETIME_SUB(CURRENT_DATETIME, INTERVAL 7 day)) AS day_7,
       COUNTIF(d.datetime >= DATETIME_SUB(CURRENT_DATETIME, INTERVAL 30 day)) AS day_30
FROM document d JOIN
     entity e
     ON d.did = e.did
GROUP BY e.name;

Although CURRENT_DATETIME can be referenced as a function (that is, with ()), the parentheses are optional, and I see no value in using them.

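Also, if you are measuring the time periods in days, you probably do not want to include the time component. If so, you should ask another question.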
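
As a rough starting point for that follow-up, one way to ignore the time component is to compare calendar dates instead of datetimes. This is a sketch under assumptions, not a definitive answer: it assumes `datetime` is a DATETIME column, and it assumes that "the last 1 day" should mean "today and yesterday" by calendar date, which may not be the window you want.

```sql
SELECT
  e.name,
  -- DATE() drops the time part, so each boundary falls on a calendar day.
  COUNTIF(DATE(d.datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))  AS day_1,
  COUNTIF(DATE(d.datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))  AS day_7,
  COUNTIF(DATE(d.datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)) AS day_30
FROM document AS d
JOIN entity AS e
  ON d.dId = e.dId
GROUP BY e.name;
```

Adjust the intervals (for example, INTERVAL 0 DAY for "today only") once the intended window boundaries are settled.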