我们有以下活动表,并希望查询它以获取每个月和上个月的唯一身份用户数。日期字段(createdat
)是timestamp
。查询需要在PostgreSQL中工作。
活动表:
| id | userid | createdat | username |
|--------|--------|-------------------------|----------------|
| 1d658a | 4957f3 | 2016-12-06 21:16:35:942 | Tom Jones |
| 3a86e3 | 684edf | 2016-12-03 21:16:35:943 | Harry Smith |
| 595756 | 582107 | 2016-12-26 21:16:35:944 | William Hanson |
| 2c87fe | 784723 | 2016-12-07 21:16:35:945 | April Cordon |
| 32509a | 4957f3 | 2016-12-20 21:16:35:946 | Tom Jones |
| 72e703 | 582107 | 2017-01-01 21:16:35:947 | William Hanson |
| 6d658a | 582107 | 2016-12-06 21:16:35:948 | William Hanson |
| 5c077c | 5934c4 | 2016-12-06 21:16:35:949 | Sandra Holmes |
| 92142b | 57ea5c | 2016-12-15 21:16:35:950 | Lucy Lawless |
| 3dd0a6 | 5934c4 | 2016-12-04 21:16:35:951 | Sandra Holmes |
| 43509a | 4957f3 | 2016-11-20 21:16:35:946 | Tom Jones |
| 85142b | 57ea5c | 2016-11-15 21:16:35:950 | Lucy Lawless |
| 7c87fe | 784723 | 2017-1-07 21:16:35:945 | April Cordon |
| 9c87fe | 784723 | 2017-2-07 21:16:35:946 | April Cordon |
结果:
| Month | UserThis Month | UserPreviousMonth |
|----------|----------------|-------------------|
| Dec 2016 | 6 | 2 |
| Jan 2017 | 2 | 6 |
| Feb 2017 | 1 | 2 |
答案 0 :(得分:1)
您可以尝试此查询。 to_char
获取MON YYYY
,您可以尝试使用lag
windows函数编写子查询以获得UserPreviousMonth
次数。
SELECT *
FROM (SELECT To_char(createdat, 'MON YYYY') Months,
Count(DISTINCT username) UserThisMonth,
Lag(Count(DISTINCT username)) OVER (
ORDER BY Date_part('year', createdat),
Date_part('month',createdat)
) UserPreviousMonth
FROM t
GROUP BY Date_part('year', createdat),
To_char(createdat, 'MON YYYY'),
Date_part('month', createdat)) t
WHERE userpreviousmonth IS NOT NULL
sqlfiddle:http://sqlfiddle.com/#!15/45e52/2
| months | userthismonth | userpreviousmonth |
|----------|---------------|-------------------|
| DEC 2016 | 6 | 2 |
| JAN 2017 | 2 | 6 |
| FEB 2017 | 1 | 2 |
修改强>
Dec 2016
和Jan 2017
...的类型必须为字符串,因为DateTime
需要完整日期,例如2017-01-01
。如果您需要对图表进行排序和使用,我建议您对此查询years
和months
列进行排序,然后在前端创建日期字符串。
SELECT *
FROM (SELECT Date_part('year', createdat) years,
Date_part('month', createdat) months,
Count(DISTINCT username) UserThisMonth,
Lag(Count(DISTINCT username)) OVER (
ORDER BY Date_part('year', createdat),
Date_part('month',createdat)
) UserPreviousMonth
FROM user_activity
GROUP BY Date_part('year', createdat),
Date_part('month', createdat)) t
WHERE userpreviousmonth IS NOT NULL
sqlfiddle:http://sqlfiddle.com/#!15/2da2b/4
| years | months | userthismonth | userpreviousmonth |
|-------|--------|---------------|-------------------|
| 2016 | 12 | 6 | 2 |
| 2017 | 1 | 2 | 6 |
| 2017 | 2 | 1 | 2 |
答案 1 :(得分:1)
修改强> 无耻地使用@ D-Shih生成年/月组合的优越方法。
一些解决方案:
WITH ua AS (
SELECT
TO_CHAR(createdate, 'YYYYMM') AS year_month,
COUNT(DISTINCT userid) distinct_users
FROM user_activity
GROUP BY
TO_CHAR(createdate, 'YYYYMM')
)
SELECT * FROM (
SELECT
TO_DATE(ua.year_month || '01', 'YYYYMMDD')
+ INTERVAL '1 month'
- INTERVAL '1 day'
AS month_end,
ua.distinct_users,
LAG(ua.distinct_users) OVER (ORDER BY ua.year_month) distinct_users_last_month
FROM ua
) uas WHERE uas.distinct_users_last_month IS NOT NULL
ORDER BY month_end DESC;
不需要开窗:
WITH ua AS (
SELECT
TO_CHAR(createdate, 'YYYYMM') AS year_month,
TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM') AS last_month,
COUNT(DISTINCT userid) AS distinct_users
FROM user_activity
GROUP BY
TO_CHAR(createdate, 'YYYYMM'),
TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM')
)
SELECT
TO_DATE(ua1.year_month || '01', 'YYYYMMDD')
+ INTERVAL '1 month'
- INTERVAL '1 day'
AS month_end,
ua1.distinct_users,
ua2.distinct_users AS last_distinct_users
FROM
ua ua1 LEFT OUTER JOIN ua ua2
ON ua1.year_month = ua2.last_month
WHERE ua2.distinct_users IS NOT NULL
ORDER BY ua1.year_month DESC;
<强> DDL:强>
CREATE TABLE user_activity (
id varchar(50),
userid varchar(50),
createdate timestamp,
username varchar(50)
);
COMMIT;
数据:强>
INSERT INTO user_activity VALUES ('1d658a','4957f3','20161206 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('3a86e3','684edf','20161203 21:16:35'::timestamp,'Harry Smith');
INSERT INTO user_activity VALUES ('595756','582107','20161226 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('2c87fe','784723','20161207 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('32509a','4957f3','20161220 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('72e703','582107','20170101 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('6d658a','582107','20161206 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('5c077c','5934c4','20161206 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('92142b','57ea5c','20161215 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('3dd0a6','5934c4','20161204 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('43509a','4957f3','20161120 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('85142b','57ea5c','20161115 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('7c87fe','784723','20170107 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('9c87fe','784723','20170207 21:16:35'::timestamp,'April Cordo');
COMMIT;
答案 2 :(得分:1)
date_trunc()
最快最简单。使用to_char()
一次以首选格式显示月份:
WITH cte AS (
SELECT date_trunc('month', createdat) AS mon
, count(DISTINCT username) AS ct
FROM activity
GROUP BY 1
)
SELECT to_char(t1.mon, 'MON YYYY') AS month
, t1.ct AS users_this_month
, t2.ct AS users_previous_month
FROM cte t1
LEFT JOIN cte t2 ON t2.mon = t1.mon - interval '1 mon'
ORDER BY t1.mon;
db&lt;&gt;小提琴here
您评论道:
结果表中的“月”字段必须是“日期”数据类型,因此可以对图表进行排序和使用。
为此,只需投射到最终的SELECT
:
SELECT t1.mon::date AS month ...
通过(截断的)timestamp
值进行分组和排序比通过多个值或text
表示更有效(和可靠)。
结果包括第一个月(演示中的“NOV 2016”),显示NULL
的{{1}} - 与上个月没有参赛作品一样。您可能希望显示users_previous_month
或删除行...
相关:
除此之外:“Tom Jones”形式的用户名通常不是唯一的。您需要使用唯一ID进行操作。