Question

我在postgresql中有一个这样的表。每行显示一个客户订阅了我们的产品。例如，客户1在2019-07-03支付了1个月的订阅费用。

minishift delete

我想每天计算有效的不同订户总数，结果看起来像

minishift start

第一排中还有3个不同的客户（1、2、4）仍在2019-07-03订阅产品A。我尝试每天使用groupby和distinct count，

date       product period subscriber_id units
2019-07-03       A 1Month             1     1
2019-07-02       A  1Year             2     1
2019-07-01       B  1Year             1     1
2019-06-30       B 1Month             3     1
2019-06-30       A 1Month             4     1
2019-06-03       B 1Month             4     1
2019-06-03       A 1Month             1     1

我不知道如何在这种情况下分组。如果有更好的方法可以解决此问题。我将非常感谢！

Answer 1

如果使用日期范围，这非常简单。

CREATE TABLE SUBSCRIPTION (
  date date,
  product text,
  period interval,
  subscriber_id int,
  units int
);

INSERT INTO SUBSCRIPTION VALUES 
 ('2019-07-03', 'A' , '1 month', 1, 1),
 ('2019-07-02', 'A', '1 year', 2, 1),
 ('2019-07-01', 'B', '1 year', 1, 1),
 ('2019-06-30', 'B', '1 month', 3, 1),
 ('2019-06-30', 'A', '1 month', 4, 1),
 ('2019-06-03', 'B', '1 month', 4, 1),
 ('2019-06-03', 'A', '1 month', 1, 1);

-- First, get the list of dateranges, from 2019-06-03 to 2019-07-03 (or whatever you want)
WITH dates as (
  SELECT daterange(t::date, (t + interval '1' day)::date, '[)')
  FROM generate_series('2019-06-03'::timestamp without time zone,
                       '2019-07-03', 
                       interval '1' day) as g(t)
)
  SELECT lower(daterange)::date, count(distinct subscriber_id)
  FROM dates
  LEFT JOIN subscription ON daterange <@
                             daterange(subscription.date,
                                       (subscription.date + period)::date)
  GROUP BY daterange
  ;
   lower    | count
------------+-------
 2019-06-03 |     2
 2019-06-04 |     2
 2019-06-05 |     2
 2019-06-06 |     2
 2019-06-07 |     2
 2019-06-08 |     2
 2019-06-09 |     2
 2019-06-10 |     2
 2019-06-11 |     2
 2019-06-12 |     2
 2019-06-13 |     2
 2019-06-14 |     2
 2019-06-15 |     2
 2019-06-16 |     2
 2019-06-17 |     2
 2019-06-18 |     2
 2019-06-19 |     2
 2019-06-20 |     2
 2019-06-21 |     2
 2019-06-22 |     2
 2019-06-23 |     2
 2019-06-24 |     2
 2019-06-25 |     2
 2019-06-26 |     2
 2019-06-27 |     2
 2019-06-28 |     2
 2019-06-29 |     2
 2019-06-30 |     3
 2019-07-01 |     3
 2019-07-02 |     4
 2019-07-03 |     4
(31 rows)

您可以通过将订阅有效时间作为日期范围存储（并编制索引），而不用在查询中进行计算来提高性能。

编辑：正如Jay所指出的，我忘记了按产品分组：

WITH dates as (
  SELECT daterange(t::date, (t + interval '1' day)::date, '[)')
  FROM generate_series('2019-06-03'::timestamp without time zone,
                       '2019-07-03',
                       interval '1' day) as g(t)
)
  SELECT lower(daterange)::date, product, count(distinct subscriber_id)
  FROM dates
  LEFT JOIN subscription ON daterange <@
                             daterange(subscription.date,
                                       (subscription.date + period)::date)
  GROUP BY daterange, product
  ;
   lower    | product | count
------------+---------+-------
 2019-06-03 | A       |     1
 2019-06-03 | B       |     1
 2019-06-04 | A       |     1
 2019-06-04 | B       |     1
 2019-06-05 | A       |     1
 2019-06-05 | B       |     1
 2019-06-06 | A       |     1
 2019-06-06 | B       |     1
 2019-06-07 | A       |     1
 2019-06-07 | B       |     1
 2019-06-08 | A       |     1
 2019-06-08 | B       |     1
 2019-06-09 | A       |     1
 2019-06-09 | B       |     1
 2019-06-10 | A       |     1
 2019-06-10 | B       |     1
 2019-06-11 | A       |     1
 2019-06-11 | B       |     1
 2019-06-12 | A       |     1
 2019-06-12 | B       |     1
 2019-06-13 | A       |     1
 2019-06-13 | B       |     1
 2019-06-14 | A       |     1
 2019-06-14 | B       |     1
 2019-06-15 | A       |     1
 2019-06-15 | B       |     1
 2019-06-16 | A       |     1
 2019-06-16 | B       |     1
 2019-06-17 | A       |     1
 2019-06-17 | B       |     1
 2019-06-18 | A       |     1
 2019-06-18 | B       |     1
 2019-06-19 | A       |     1
 2019-06-19 | B       |     1
 2019-06-20 | A       |     1
 2019-06-20 | B       |     1
 2019-06-21 | A       |     1
 2019-06-21 | B       |     1
 2019-06-22 | A       |     1
 2019-06-22 | B       |     1
 2019-06-23 | A       |     1
 2019-06-23 | B       |     1
 2019-06-24 | A       |     1
 2019-06-24 | B       |     1
 2019-06-25 | A       |     1
 2019-06-25 | B       |     1
 2019-06-26 | A       |     1
 2019-06-26 | B       |     1
 2019-06-27 | A       |     1
 2019-06-27 | B       |     1
 2019-06-28 | A       |     1
 2019-06-28 | B       |     1
 2019-06-29 | A       |     1
 2019-06-29 | B       |     1
 2019-06-30 | A       |     2
 2019-06-30 | B       |     2
 2019-07-01 | A       |     2
 2019-07-01 | B       |     3
 2019-07-02 | A       |     3
 2019-07-02 | B       |     3
 2019-07-03 | A       |     3
 2019-07-03 | B       |     2

如何在PostgreSQL中按特定条件将每个日期分组？

1 个答案: