I have a set of account balances:
+---------+---------------+---------+------------+
| ID | customer_id | value | timestamp |
+---------+---------------+---------+------------+
| 1 | 1 | -200 | 2019-11-18 |
| 2 | 1 | 100 | 2019-11-17 |
| 3 | 1 | -500 | 2019-11-16 |
| 4 | 1 | -200 | 2019-11-15 |
| 5 | 2 | 200 | 2019-11-15 |
| 6 | 1 | 0 | 2019-11-14 |
+---------+---------------+---------+------------+
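I want to get the number of consecutive days a customer has had a negative account balance since the last positive balance. The result should look like this: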
+---------------+---------------------------------+------------+
| customer_id | Negative account balance since | Date |
+---------------+---------------------------------+------------+
| 1 | 1 day | 2019-11-18 |
+---------------+---------------------------------+------------+
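Customer #1 has had several negative days, but because there was a positive day on 2019-11-17, the counter started over. The Date column shows the date of the customer's most recent negative balance record. Customer #2 is not in the result because he has had no negative days at all. How can I write a query like this in BigQuery (BQ)?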
Answer 0 (score: 1)
Below is for BigQuery Standard SQL:
#standardSQL
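WITH last_positive AS (
  -- latest date on which each customer had a non-negative balance (zero counts as non-negative)
  SELECT customer_id, ARRAY_AGG(`timestamp` ORDER BY `timestamp` DESC LIMIT 1)[OFFSET(0)] `timestamp`
  FROM `project.dataset.table`
  WHERE value >= 0
  GROUP BY customer_id
), last_any AS (
  -- latest date of any record for each customer
  SELECT customer_id, MAX(`timestamp`) `timestamp`
  FROM `project.dataset.table`
  GROUP BY customer_id
)
SELECT customer_id,
  DATE_DIFF(a.timestamp, b.timestamp, DAY) days_since,  -- days since the last non-negative balance
  DATE_ADD(b.timestamp, INTERVAL 1 DAY) `timestamp`     -- first day after the last non-negative balance
FROM last_any a
JOIN last_positive b
USING(customer_id)
-- keep only customers whose latest record is newer than their last non-negative one
WHERE a.timestamp > b.timestamp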
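If it helps to see the logic outside of BigQuery, here is a rough Python sketch of what the two CTEs and the join compute. It is only an illustration for checking expected output by hand, not a replacement for the query; the names `ROWS` and `negative_since` are made up for this sketch, and the rows are the sample data from the question.

```python
from datetime import date, timedelta

# Sample rows from the question: (id, customer_id, value, timestamp)
ROWS = [
    (1, 1, -200, date(2019, 11, 18)),
    (2, 1, 100, date(2019, 11, 17)),
    (3, 1, -500, date(2019, 11, 16)),
    (4, 1, -200, date(2019, 11, 15)),
    (5, 2, 200, date(2019, 11, 15)),
    (6, 1, 0, date(2019, 11, 14)),
]


def negative_since(rows):
    """Mirror the last_positive / last_any CTEs and their join (hypothetical helper)."""
    last_positive = {}  # customer_id -> latest date with value >= 0
    last_any = {}       # customer_id -> latest date of any record
    for _id, customer_id, value, day in rows:
        if customer_id not in last_any or day > last_any[customer_id]:
            last_any[customer_id] = day
        if value >= 0 and (
            customer_id not in last_positive or day > last_positive[customer_id]
        ):
            last_positive[customer_id] = day

    result = []
    # Inner join: only customers that have at least one non-negative record.
    for customer_id in sorted(last_positive):
        newest = last_any[customer_id]
        positive = last_positive[customer_id]
        if newest > positive:  # same filter as the WHERE clause
            days_since = (newest - positive).days
            result.append((customer_id, days_since, positive + timedelta(days=1)))
    return result


print(negative_since(ROWS))  # [(1, 1, datetime.date(2019, 11, 18))]
```

Customer #2 drops out because the latest record is itself non-negative, so the final filter removes it.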
You can test it with the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 1 customer_id, -200 value, DATE '2019-11-18' `timestamp` UNION ALL
SELECT 2, 1, 100, '2019-11-17' UNION ALL
SELECT 3, 1, -500, '2019-11-16' UNION ALL
SELECT 4, 1, -200, '2019-11-15' UNION ALL
SELECT 5, 2, 200, '2019-11-15' UNION ALL
SELECT 6, 1, 0, '2019-11-14'
), last_positive AS (
SELECT customer_id, ARRAY_AGG(`timestamp` ORDER BY `timestamp` DESC LIMIT 1)[OFFSET(0)] `timestamp`
FROM `project.dataset.table`
WHERE value >= 0
GROUP BY customer_id
), last_any AS (
SELECT customer_id, MAX(`timestamp`) `timestamp`
FROM `project.dataset.table`
GROUP BY customer_id
)
SELECT customer_id, DATE_DIFF(a.timestamp, b.timestamp, DAY) days_since, DATE_ADD(b.timestamp, INTERVAL 1 DAY) `timestamp`
FROM last_any a
JOIN last_positive b
USING(customer_id)
WHERE a.timestamp > b.timestamp
The result is:
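+-----+-------------+------------+------------+
| Row | customer_id | days_since | timestamp  |
+-----+-------------+------------+------------+
| 1   | 1           | 1          | 2019-11-18 |
+-----+-------------+------------+------------+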
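One caveat worth keeping in mind: because the query uses an inner JOIN against last_positive, a customer who has never had a non-negative balance will not appear in the output at all. If that case matters, a different approach (not the one used in the query above) is to walk each customer's records from newest to oldest and count until the first non-negative one. Below is a minimal Python sketch of that idea, just to illustrate the logic. The helper name `negative_streaks` is made up, and customer 3 is hypothetical data added for illustration, not part of the question.

```python
from collections import defaultdict
from datetime import date

# (customer_id, value, timestamp). Customers 1 and 2 come from the question;
# customer 3 is a hypothetical customer who has never been non-negative.
ROWS = [
    (1, -200, date(2019, 11, 18)),
    (1, 100, date(2019, 11, 17)),
    (1, -500, date(2019, 11, 16)),
    (1, -200, date(2019, 11, 15)),
    (2, 200, date(2019, 11, 15)),
    (1, 0, date(2019, 11, 14)),
    (3, -50, date(2019, 11, 18)),
    (3, -10, date(2019, 11, 17)),
]


def negative_streaks(rows):
    """Count trailing negative records per customer (hypothetical helper)."""
    by_customer = defaultdict(list)
    for customer_id, value, day in rows:
        by_customer[customer_id].append((day, value))

    result = {}
    for customer_id, records in by_customer.items():
        records.sort(reverse=True)  # newest record first
        streak = 0
        for _day, value in records:
            if value >= 0:  # zero counts as non-negative, as in the query
                break
            streak += 1
        if streak:
            # (records in the current streak, date of the newest negative record)
            result[customer_id] = (streak, records[0][0])
    return result


print(negative_streaks(ROWS))
```

Note that this counts records rather than calendar days, so it matches the DATE_DIFF result only when there is exactly one record per customer per day.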