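Query to get user counts for each month and the previous month

Time: 2018-06-16 21:10:56

Tags: sql postgresql datetime aggregate

We have the following activity table and want to query it to get the number of unique users for each month and for the month before it. The date field (createdat) is a timestamp. The query needs to work in PostgreSQL.

Activity table: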

|   id   | userid |        createdat        |    username    |
|--------|--------|-------------------------|----------------|
| 1d658a | 4957f3 | 2016-12-06 21:16:35:942 | Tom Jones      |
| 3a86e3 | 684edf | 2016-12-03 21:16:35:943 | Harry Smith    |
| 595756 | 582107 | 2016-12-26 21:16:35:944 | William Hanson |
| 2c87fe | 784723 | 2016-12-07 21:16:35:945 | April Cordon   |
| 32509a | 4957f3 | 2016-12-20 21:16:35:946 | Tom Jones      |
| 72e703 | 582107 | 2017-01-01 21:16:35:947 | William Hanson |
| 6d658a | 582107 | 2016-12-06 21:16:35:948 | William Hanson |
| 5c077c | 5934c4 | 2016-12-06 21:16:35:949 | Sandra Holmes  |
| 92142b | 57ea5c | 2016-12-15 21:16:35:950 | Lucy Lawless   |
| 3dd0a6 | 5934c4 | 2016-12-04 21:16:35:951 | Sandra Holmes  |
| 43509a | 4957f3 | 2016-11-20 21:16:35:946 | Tom Jones      |
| 85142b | 57ea5c | 2016-11-15 21:16:35:950 | Lucy Lawless   |
| 7c87fe | 784723 | 2017-1-07 21:16:35:945  | April Cordon   |
| 9c87fe | 784723 | 2017-2-07 21:16:35:946  | April Cordon   |

Result:

|  Month   | UserThisMonth  | UserPreviousMonth |
|----------|----------------|-------------------|
| Dec 2016 |              6 |                 2 |
| Jan 2017 |              2 |                 6 |
| Feb 2017 |              1 |                 2 |

3 Answers:

Answer 0 (score: 1)

You can try this query. to_char produces the MON YYYY label, and the lag window function, computed inside a subquery, gives the UserPreviousMonth count.

SELECT * 
FROM   (SELECT To_char(createdat, 'MON YYYY') Months, 
               Count(DISTINCT username) UserThisMonth, 
               Lag(Count(DISTINCT username)) OVER ( 
                   ORDER BY Date_part('year', createdat), 
                            Date_part('month',createdat) 
                 ) UserPreviousMonth 
        FROM   t 
        GROUP  BY Date_part('year', createdat), 
                  To_char(createdat, 'MON YYYY'), 
                  Date_part('month', createdat)) t 
WHERE  userpreviousmonth IS NOT NULL 

sqlfiddle:http://sqlfiddle.com/#!15/45e52/2

|   months | userthismonth | userpreviousmonth |
|----------|---------------|-------------------|
| DEC 2016 |             6 |                 2 |
| JAN 2017 |             2 |                 6 |
| FEB 2017 |             1 |                 2 |

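Edit

Values such as Dec 2016, Jan 2017, ... have to be strings, because a DateTime needs a full date such as 2017-01-01. If you need to sort the result and use it in a chart, I suggest ordering by the years and months columns of this query and then building the date string in the front end.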

SELECT * 
FROM   (SELECT Date_part('year', createdat) years, 
               Date_part('month', createdat) months,
               Count(DISTINCT username) UserThisMonth, 
               Lag(Count(DISTINCT username)) OVER ( 
                   ORDER BY Date_part('year', createdat), 
                            Date_part('month',createdat) 
                 ) UserPreviousMonth 
        FROM  user_activity 
        GROUP  BY Date_part('year', createdat), 
                  Date_part('month', createdat)) t 
WHERE  userpreviousmonth IS NOT NULL 

sqlfiddle:http://sqlfiddle.com/#!15/2da2b/4

| years | months | userthismonth | userpreviousmonth |
|-------|--------|---------------|-------------------|
|  2016 |     12 |             6 |                 2 |
|  2017 |      1 |             2 |                 6 |
|  2017 |      2 |             1 |                 2 |

Answer 1 (score: 1)

Edit: shamelessly borrowing @D-Shih's superior method of generating the year/month combination.

Some solutions:

WITH ua AS (
  SELECT 
    TO_CHAR(createdate, 'YYYYMM') AS year_month,
    COUNT(DISTINCT userid) distinct_users
  FROM user_activity
  GROUP BY
    TO_CHAR(createdate, 'YYYYMM')
)
SELECT * FROM (
  SELECT 
    TO_DATE(ua.year_month || '01', 'YYYYMMDD') 
        + INTERVAL '1 month' 
        - INTERVAL '1 day' 
    AS month_end,
    ua.distinct_users,
    LAG(ua.distinct_users) OVER (ORDER BY ua.year_month) distinct_users_last_month
  FROM ua
) uas WHERE uas.distinct_users_last_month IS NOT NULL
ORDER BY month_end DESC;

Without a window function:

WITH ua AS (
  SELECT 
    TO_CHAR(createdate, 'YYYYMM') AS year_month,
    TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM') AS last_month,
    COUNT(DISTINCT userid) AS distinct_users
  FROM user_activity
  GROUP BY
    TO_CHAR(createdate, 'YYYYMM'),
    TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM')
)
SELECT 
  TO_DATE(ua1.year_month || '01', 'YYYYMMDD') 
        + INTERVAL '1 month' 
        - INTERVAL '1 day' 
    AS month_end,
  ua1.distinct_users,
  ua2.distinct_users AS last_distinct_users
FROM 
  ua ua1 LEFT OUTER JOIN ua ua2 
    ON ua2.year_month = ua1.last_month
WHERE ua2.distinct_users IS NOT NULL
ORDER BY ua1.year_month DESC;

DDL:

CREATE TABLE user_activity (
  id varchar(50),
  userid varchar(50),
  createdate timestamp,
  username varchar(50)
);
COMMIT;

Data:

INSERT INTO user_activity VALUES ('1d658a','4957f3','20161206 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('3a86e3','684edf','20161203 21:16:35'::timestamp,'Harry Smith');
INSERT INTO user_activity VALUES ('595756','582107','20161226 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('2c87fe','784723','20161207 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('32509a','4957f3','20161220 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('72e703','582107','20170101 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('6d658a','582107','20161206 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('5c077c','5934c4','20161206 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('92142b','57ea5c','20161215 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('3dd0a6','5934c4','20161204 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('43509a','4957f3','20161120 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('85142b','57ea5c','20161115 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('7c87fe','784723','20170107 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('9c87fe','784723','20170207 21:16:35'::timestamp,'April Cordon');
COMMIT;

Answer 2 (score: 1)

date_trunc() is the fastest and simplest option. Use to_char() once to display the month in your preferred format:

WITH cte AS (
   SELECT date_trunc('month', createdat) AS mon
        , count(DISTINCT username) AS ct
   FROM   activity
   GROUP  BY 1
   )
SELECT to_char(t1.mon, 'MON YYYY') AS month
     , t1.ct AS users_this_month
     , t2.ct AS users_previous_month
FROM        cte t1
LEFT   JOIN cte t2 ON t2.mon = t1.mon - interval '1 mon'
ORDER  BY t1.mon;

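db<>fiddle here

You commented:

The "Month" field in the result table must be of the "date" data type, so that it can be sorted and used in charts.

For that, just cast in the final SELECT: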

SELECT t1.mon::date AS month ...

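Grouping and ordering by the (truncated) timestamp value is more efficient (and more reliable) than doing so by multiple values or by a text representation.

The result includes the first month ("NOV 2016" in your demo), showing NULL for users_previous_month, as it would for any other month with no entries in the preceding month. You may want to display some other value there instead, or remove the row ...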
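
As a rough sketch of those two options, assuming the same activity table and createdat column as in the query above: COALESCE can substitute a placeholder for the NULL (the value 0 is only an illustrative choice here, not something the question prescribes), while swapping LEFT JOIN for a plain JOIN drops the row.

```sql
-- Sketch only: same CTE as in the answer; the replacement value 0 is an assumption.
WITH cte AS (
   SELECT date_trunc('month', createdat) AS mon
        , count(DISTINCT username) AS ct
   FROM   activity
   GROUP  BY 1
   )
SELECT to_char(t1.mon, 'MON YYYY') AS month
     , t1.ct AS users_this_month
     , COALESCE(t2.ct, 0) AS users_previous_month  -- NULL is shown as 0
FROM        cte t1
LEFT   JOIN cte t2 ON t2.mon = t1.mon - interval '1 mon'
-- To remove such rows instead, replace LEFT JOIN with JOIN.
ORDER  BY t1.mon;
```

Note that the plain JOIN variant removes every month whose preceding month has no rows at all, not just the first month in the data.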

Aside: usernames of the form "Tom Jones" are typically not unique. You need to operate on a unique ID instead.
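
A minimal sketch of that change, assuming the userid column from the question's table identifies a user uniquely (the question does not state this explicitly):

```sql
-- Sketch: count distinct userid values per month instead of usernames.
-- Assumes activity.userid is a unique identifier for each user.
SELECT date_trunc('month', createdat)::date AS month
     , count(DISTINCT userid) AS users_this_month
FROM   activity
GROUP  BY 1
ORDER  BY 1;
```

The same substitution would apply inside the CTE of the full query, leaving the self-join on the previous month unchanged.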