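I am trying to summarize data that is not stored directly in the table.
I have a transaction table with user_id, transaction_id, created_at, and completed_at.
Between created_at and completed_at, a transaction is pending.
For each of the past 30 days, I would like the number of users who had a pending transaction on that day.
I have tried: A) generating a series of dates and joining it to my source table; B) selecting the distinct created_at days and then selecting from my source table; C) exporting the full result to Google Sheets (too large).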
WITH raw as (
    SELECT t.user_id,
           t.id,
           date_trunc('day',t.created_at) created_at,
           date_trunc('day',t.modified_at) modified_at
    FROM transaction t
)
-- anchor_day is not defined here: it stands for each day to report on
Select anchor_day,
       COUNT(distinct user_id) FILTER (where created_at <= anchor_day AND modified_at > anchor_day)
FROM raw;
Sample data:
+---------+----------------+---------------------------+---------------------------+
| user_id | transaction_id | created_at | completed_at |
+---------+----------------+---------------------------+---------------------------+
| abcdefg | 1 | August 1, 2019, 12:00 AM | August 7, 2019, 12:00 AM |
| abcdefg | 2 | August 1, 2019, 12:00 AM | August 9, 2019, 12:00 AM |
| abcdefg | 3 | August 12, 2019, 12:00 AM | August 16, 2019, 12:00 AM |
| hijklmn | 4 | August 7, 2019, 12:00 AM | August 11, 2019, 12:00 AM |
| opqrstu | 5 | August 8, 2019, 12:00 AM | August 17, 2019, 12:00 AM |
| opqrstu | 6 | August 8, 2019, 12:00 AM | August 16, 2019, 12:00 AM |
+---------+----------------+---------------------------+---------------------------+
Desired output:
+--------------------------+-------------------------------------------+
| Day | Number of users with pending transactions |
+--------------------------+-------------------------------------------+
| August 1, 2019, 12:00 AM | 2 |
| August 2, 2019, 12:00 AM | 2 |
| August 3, 2019, 12:00 AM | 2 |
| August 4, 2019, 12:00 AM | 2 |
| August 5, 2019, 12:00 AM | 2 |
| August 6, 2019, 12:00 AM | 2 |
| August 7, 2019, 12:00 AM | 1 |
+--------------------------+-------------------------------------------+
Answer 0 (score: 1)
You can use generate_series():
-- "raw" is the CTE from the question; prepend its WITH clause to run this
select gs.dte,
       (select count(distinct r.user_id)
        from raw r
        where r.created_at <= gs.dte and r.modified_at > gs.dte
       ) as num_users
from generate_series(current_date - interval '1 month', current_date, interval '1 day') gs(dte)
order by gs.dte;
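To check the day-by-day logic against the sample rows without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. Several things in it are assumptions rather than part of the question or the answer: the table is named transactions, dates are stored as ISO text, the comparison uses completed_at (the column in the table description; the question's query uses modified_at), and a recursive CTE stands in for generate_series(), which SQLite does not provide by default. It also uses a LEFT JOIN with GROUP BY instead of the correlated subquery, which should give the same counts.

```python
import sqlite3

# Sample rows from the question: (user_id, transaction_id, created_at, completed_at)
ROWS = [
    ("abcdefg", 1, "2019-08-01", "2019-08-07"),
    ("abcdefg", 2, "2019-08-01", "2019-08-09"),
    ("abcdefg", 3, "2019-08-12", "2019-08-16"),
    ("hijklmn", 4, "2019-08-07", "2019-08-11"),
    ("opqrstu", 5, "2019-08-08", "2019-08-17"),
    ("opqrstu", 6, "2019-08-08", "2019-08-16"),
]

# The recursive CTE plays the role of generate_series(): one row per day.
# A transaction counts as pending on a day when created_at <= day < completed_at.
QUERY = """
WITH RECURSIVE days(dte) AS (
    SELECT :first_day
    UNION ALL
    SELECT date(dte, '+1 day') FROM days WHERE dte < :last_day
)
SELECT d.dte, COUNT(DISTINCT t.user_id) AS num_users
FROM days d
LEFT JOIN transactions t
       ON t.created_at <= d.dte AND t.completed_at > d.dte
GROUP BY d.dte
ORDER BY d.dte
"""


def pending_users_per_day(first_day, last_day):
    """Return [(day, distinct users with a pending transaction)] for each day in range."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "user_id TEXT, transaction_id INTEGER, created_at TEXT, completed_at TEXT)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(
        QUERY, {"first_day": first_day, "last_day": last_day}
    ).fetchall()
    conn.close()
    return result


counts = pending_users_per_day("2019-08-01", "2019-08-17")
for day, num_users in counts:
    print(day, num_users)
```

Because the join is a LEFT JOIN and COUNT(DISTINCT ...) ignores NULLs, days with no pending transaction still appear with a count of 0. On the six sample rows the count peaks at 3 on 2019-08-08 and is 0 on 2019-08-17.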