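I have a table like the one below in PostgreSQL. Each row records a customer subscribing to one of our products. For example, customer 1 paid for a 1-month subscription on 2019-07-03.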
I want to calculate, for each day, the total number of distinct subscribers whose subscription is still active.
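In the first row of the expected result, 3 distinct customers (1, 2, 4) are still subscribed to product A on 2019-07-03. I tried using GROUP BY with a distinct count for each day on this table: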
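    date    | product | period | subscriber_id | units
------------+---------+--------+---------------+-------
 2019-07-03 | A       | 1Month |             1 |     1
 2019-07-02 | A       | 1Year  |             2 |     1
 2019-07-01 | B       | 1Year  |             1 |     1
 2019-06-30 | B       | 1Month |             3 |     1
 2019-06-30 | A       | 1Month |             4 |     1
 2019-06-03 | B       | 1Month |             4 |     1
 2019-06-03 | A       | 1Month |             1 |     1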
I don't know how to group the rows in this case. If there is a better way to solve this problem, I would really appreciate the help!
Answer 0 (score: 1)
This is quite simple if you use date ranges.
CREATE TABLE SUBSCRIPTION (
date date,
product text,
period interval,
subscriber_id int,
units int
);
INSERT INTO SUBSCRIPTION VALUES
('2019-07-03', 'A' , '1 month', 1, 1),
('2019-07-02', 'A', '1 year', 2, 1),
('2019-07-01', 'B', '1 year', 1, 1),
('2019-06-30', 'B', '1 month', 3, 1),
('2019-06-30', 'A', '1 month', 4, 1),
('2019-06-03', 'B', '1 month', 4, 1),
('2019-06-03', 'A', '1 month', 1, 1);
-- First, get the list of dateranges, from 2019-06-03 to 2019-07-03 (or whatever you want)
WITH dates as (
SELECT daterange(t::date, (t + interval '1' day)::date, '[)')
FROM generate_series('2019-06-03'::timestamp without time zone,
'2019-07-03',
interval '1' day) as g(t)
)
SELECT lower(daterange)::date, count(distinct subscriber_id)
FROM dates
LEFT JOIN subscription ON daterange <@
daterange(subscription.date,
(subscription.date + period)::date)
GROUP BY daterange
ORDER BY daterange -- without this, the row order is not guaranteed
;
lower | count
------------+-------
2019-06-03 | 2
2019-06-04 | 2
2019-06-05 | 2
2019-06-06 | 2
2019-06-07 | 2
2019-06-08 | 2
2019-06-09 | 2
2019-06-10 | 2
2019-06-11 | 2
2019-06-12 | 2
2019-06-13 | 2
2019-06-14 | 2
2019-06-15 | 2
2019-06-16 | 2
2019-06-17 | 2
2019-06-18 | 2
2019-06-19 | 2
2019-06-20 | 2
2019-06-21 | 2
2019-06-22 | 2
2019-06-23 | 2
2019-06-24 | 2
2019-06-25 | 2
2019-06-26 | 2
2019-06-27 | 2
2019-06-28 | 2
2019-06-29 | 2
2019-06-30 | 3
2019-07-01 | 3
2019-07-02 | 4
2019-07-03 | 4
(31 rows)
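The counts depend on the ranges being half-open: `daterange(start, end)` defaults to `'[)'` bounds, so a subscription stops counting on its end date. A minimal sketch that checks this for the 1-month subscription bought on 2019-06-03 (the column aliases are illustrative, not part of the original query):

```sql
-- A 1-month subscription bought on 2019-06-03 covers [2019-06-03, 2019-07-03).
SELECT
    -- The single day 2019-07-02 lies inside the subscription range.
    daterange('2019-07-02', '2019-07-03', '[)')
        <@ daterange('2019-06-03',
                     ('2019-06-03'::date + interval '1 month')::date)
        AS active_on_07_02,   -- true
    -- The single day 2019-07-03 does not: the upper bound is exclusive.
    daterange('2019-07-03', '2019-07-04', '[)')
        <@ daterange('2019-06-03',
                     ('2019-06-03'::date + interval '1 month')::date)
        AS active_on_07_03;   -- false
```

This is why the two subscriptions from 2019-06-03 no longer contribute to the count on 2019-07-03.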
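You can improve performance by storing the subscription's validity period as a date range (and indexing it) instead of computing it in the query.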
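One way this could look is sketched below. The table name `subscription_ranged`, the column name `valid_during`, and the index name are assumptions made for illustration; they do not appear in the original schema. The range is computed once at insert time and indexed with GiST, so the join only needs a containment test.

```sql
-- Hypothetical variant of the table that stores the validity range directly.
CREATE TABLE subscription_ranged (
    date          date,
    product       text,
    period        interval,
    subscriber_id int,
    units         int,
    valid_during  daterange   -- assumed column: [date, date + period)
);

-- Compute the range once, when the row is written.
INSERT INTO subscription_ranged
SELECT d, product, period, subscriber_id, units,
       daterange(d, (d + period)::date, '[)')
FROM (VALUES
    ('2019-07-03'::date, 'A', '1 month'::interval, 1, 1),
    ('2019-07-02', 'A', '1 year',  2, 1),
    ('2019-07-01', 'B', '1 year',  1, 1),
    ('2019-06-30', 'B', '1 month', 3, 1),
    ('2019-06-30', 'A', '1 month', 4, 1),
    ('2019-06-03', 'B', '1 month', 4, 1),
    ('2019-06-03', 'A', '1 month', 1, 1)
) AS v(d, product, period, subscriber_id, units);

-- GiST indexes support the range containment operator @>.
CREATE INDEX subscription_ranged_valid_during_idx
    ON subscription_ranged USING gist (valid_during);

-- Same per-day, per-product count, now testing "range contains this day".
SELECT g.day::date AS day, product, count(DISTINCT subscriber_id)
FROM generate_series('2019-06-03'::timestamp,
                     '2019-07-03',
                     interval '1 day') AS g(day)
LEFT JOIN subscription_ranged s ON s.valid_during @> g.day::date
GROUP BY g.day, product
ORDER BY g.day, product;
```

Whether the planner actually uses the index depends on table size and statistics, so it is worth checking with `EXPLAIN` on real data.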
Edit: As Jay pointed out, I forgot to group by product:
WITH dates as (
SELECT daterange(t::date, (t + interval '1' day)::date, '[)')
FROM generate_series('2019-06-03'::timestamp without time zone,
'2019-07-03',
interval '1' day) as g(t)
)
SELECT lower(daterange)::date, product, count(distinct subscriber_id)
FROM dates
LEFT JOIN subscription ON daterange <@
daterange(subscription.date,
(subscription.date + period)::date)
GROUP BY daterange, product
ORDER BY daterange, product -- without this, the row order is not guaranteed
;
lower | product | count
------------+---------+-------
2019-06-03 | A | 1
2019-06-03 | B | 1
2019-06-04 | A | 1
2019-06-04 | B | 1
2019-06-05 | A | 1
2019-06-05 | B | 1
2019-06-06 | A | 1
2019-06-06 | B | 1
2019-06-07 | A | 1
2019-06-07 | B | 1
2019-06-08 | A | 1
2019-06-08 | B | 1
2019-06-09 | A | 1
2019-06-09 | B | 1
2019-06-10 | A | 1
2019-06-10 | B | 1
2019-06-11 | A | 1
2019-06-11 | B | 1
2019-06-12 | A | 1
2019-06-12 | B | 1
2019-06-13 | A | 1
2019-06-13 | B | 1
2019-06-14 | A | 1
2019-06-14 | B | 1
2019-06-15 | A | 1
2019-06-15 | B | 1
2019-06-16 | A | 1
2019-06-16 | B | 1
2019-06-17 | A | 1
2019-06-17 | B | 1
2019-06-18 | A | 1
2019-06-18 | B | 1
2019-06-19 | A | 1
2019-06-19 | B | 1
2019-06-20 | A | 1
2019-06-20 | B | 1
2019-06-21 | A | 1
2019-06-21 | B | 1
2019-06-22 | A | 1
2019-06-22 | B | 1
2019-06-23 | A | 1
2019-06-23 | B | 1
2019-06-24 | A | 1
2019-06-24 | B | 1
2019-06-25 | A | 1
2019-06-25 | B | 1
2019-06-26 | A | 1
2019-06-26 | B | 1
2019-06-27 | A | 1
2019-06-27 | B | 1
2019-06-28 | A | 1
2019-06-28 | B | 1
2019-06-29 | A | 1
2019-06-29 | B | 1
2019-06-30 | A | 2
2019-06-30 | B | 2
2019-07-01 | A | 2
2019-07-01 | B | 3
2019-07-02 | A | 3
2019-07-02 | B | 3
2019-07-03 | A | 3
2019-07-03 | B | 2
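As a spot check against the question's expectation (customers 1, 2 and 4 on product A on 2019-07-03), the subscribers behind a single count can be listed. This is a minimal sketch: the CTE name `subscription_sample` is made up so that the query runs on its own, and it simply repeats the sample rows from above.

```sql
WITH subscription_sample (date, product, period, subscriber_id, units) AS (
    VALUES
        ('2019-07-03'::date, 'A', '1 month'::interval, 1, 1),
        ('2019-07-02', 'A', '1 year',  2, 1),
        ('2019-07-01', 'B', '1 year',  1, 1),
        ('2019-06-30', 'B', '1 month', 3, 1),
        ('2019-06-30', 'A', '1 month', 4, 1),
        ('2019-06-03', 'B', '1 month', 4, 1),
        ('2019-06-03', 'A', '1 month', 1, 1)
)
-- Distinct subscribers of product A whose range contains 2019-07-03.
SELECT array_agg(DISTINCT subscriber_id ORDER BY subscriber_id) AS subscribers
FROM subscription_sample
WHERE product = 'A'
  AND daterange(date, (date + period)::date) @> '2019-07-03'::date;
-- subscribers: {1,2,4}
```

Customer 1 appears through the new subscription bought on 2019-07-03; the earlier one from 2019-06-03 has already expired on that day.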