How to use anchor dates in SQL that are not stored in the related table

Time: 2019-08-20 13:57:01

Tags: sql postgresql

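I am trying to summarize data by a value that is not stored in the table itself.

I have a transaction table with user_id, transaction_id, created_at and completed_at.

A transaction is pending between its created_at and its completed_at.

For each of the past 30 days, I want the number of users who had a pending transaction on that day.

I have tried: A) generating a series of dates and joining it to my source table; B) selecting the distinct created_at days and then selecting from my source table; C) exporting the full result set to a Google Sheet (too large).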

WITH raw as (
  SELECT t.user_id,
         t.id,
         date_trunc('day', t.created_at) created_at,
         date_trunc('day', t.modified_at) modified_at
  FROM transaction t
)
-- anchor_day is not defined anywhere: this is the missing piece
SELECT anchor_day,
       COUNT(distinct user_id) FILTER (where created_at <= anchor_day AND modified_at > anchor_day)
FROM raw;

Sample data:

+---------+----------------+---------------------------+---------------------------+
| user_id | transaction_id |        created_at         |       completed_at        |
+---------+----------------+---------------------------+---------------------------+
| abcdefg |              1 | August 1, 2019, 12:00 AM  | August 7, 2019, 12:00 AM  |
| abcdefg |              2 | August 1, 2019, 12:00 AM  | August 9, 2019, 12:00 AM  |
| abcdefg |              3 | August 12, 2019, 12:00 AM | August 16, 2019, 12:00 AM |
| hijklmn |              4 | August 7, 2019, 12:00 AM  | August 11, 2019, 12:00 AM |
| opqrstu |              5 | August 8, 2019, 12:00 AM  | August 17, 2019, 12:00 AM |
| opqrstu |              6 | August 8, 2019, 12:00 AM  | August 16, 2019, 12:00 AM |
+---------+----------------+---------------------------+---------------------------+

Desired output:

+--------------------------+-------------------------------------------+
|           Day            | Number of users with pending transactions |
+--------------------------+-------------------------------------------+
| August 1, 2019, 12:00 AM |                                         2 |
| August 2, 2019, 12:00 AM |                                         2 |
| August 3, 2019, 12:00 AM |                                         2 |
| August 4, 2019, 12:00 AM |                                         2 |
| August 5, 2019, 12:00 AM |                                         2 |
| August 6, 2019, 12:00 AM |                                         2 |
| August 7, 2019, 12:00 AM |                                         1 |
+--------------------------+-------------------------------------------+

1 Answer:

Answer 0: (score: 1)

You can use generate_series() to produce the anchor days, then count the matching users for each day:

-- raw is the CTE from the question; prefix this query with that WITH clause
select gs.dte,
       (select count(distinct r.user_id)
        from raw r
        where r.created_at <= gs.dte and r.modified_at > gs.dte
       ) as num_users
from generate_series(current_date - interval '1 month', current_date, interval '1 day') gs(dte)
order by gs.dte;
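
For anyone who wants to try this without access to the real table, here is a minimal self-contained sketch of the same idea, written as a LEFT JOIN plus GROUP BY instead of a correlated subquery. It is not the answerer's exact query. The CTE name transactions and the fixed date range are assumptions made for illustration. The rows are copied from the sample data in the question, and the end column is assumed to be completed_at, as in that sample, rather than the modified_at used in the queries above.

```sql
-- Illustrative only: inline copy of the question's sample rows.
WITH transactions (user_id, transaction_id, created_at, completed_at) AS (
    VALUES
        ('abcdefg', 1, timestamp '2019-08-01', timestamp '2019-08-07'),
        ('abcdefg', 2, timestamp '2019-08-01', timestamp '2019-08-09'),
        ('abcdefg', 3, timestamp '2019-08-12', timestamp '2019-08-16'),
        ('hijklmn', 4, timestamp '2019-08-07', timestamp '2019-08-11'),
        ('opqrstu', 5, timestamp '2019-08-08', timestamp '2019-08-17'),
        ('opqrstu', 6, timestamp '2019-08-08', timestamp '2019-08-16')
)
SELECT gs.dte AS anchor_day,
       -- COUNT ignores NULLs, so days with no pending transaction show 0
       COUNT(DISTINCT t.user_id) AS num_users
FROM generate_series(timestamp '2019-08-01',   -- assumed range covering the sample
                     timestamp '2019-08-17',
                     interval '1 day') AS gs(dte)
LEFT JOIN transactions t
       ON date_trunc('day', t.created_at) <= gs.dte
      AND date_trunc('day', t.completed_at) > gs.dte
GROUP BY gs.dte
ORDER BY gs.dte;
```

With the strict `>` comparison, a transaction is not counted on the day it completes; change it to `>=` if the completion day should still count as pending. If transactions that are still open store NULL in completed_at (an assumption about the real data), the comparison evaluates to NULL and those rows are dropped, so something like `coalesce(t.completed_at, 'infinity')` would be needed to keep them.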