How to show week-over-week login changes broken down by cohort

Date: 2019-02-03 00:30:06

Tags: sql sql-server partitioning

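Goal: I need to segment emails by the month in which they subscribed; that month determines the cohort. In other words, everyone who subscribed in January 2018 is in one cohort, everyone who subscribed in February 2018 is in another, and so on. I then need to look at each cohort's login activity from one week to the next. For example, if 100 subscribers from the January 2018 cohort logged in during ISO_WEEK 2 of 2019 and 70 of them also logged in during ISO_WEEK 3, the retention rate is 70%.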

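Problem: I am not sure how to write a query that puts the cohort (e.g. Jan2018, Feb2018, Mar2018) in the first column, followed by one column per ISO_WEEK, starting in 2019, each holding the count of distinct emails that logged in during that week.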

Sample data:

CREATE TABLE member
    ([email] varchar(50), [creation_date] Datetime)
INSERT INTO member
VALUES
    ('player123@google.com', '2018-01-01 05:00:00'),
    ('player999@google.com', '2018-01-30 12:00:00'),
    ('player555@google.com', '2018-05-14 20:15:00')
CREATE TABLE login
    ([email] varchar(100), [login_date] Datetime)
INSERT INTO login
VALUES
    ('player123@google.com', '2019-01-07 05:30:00'),
    ('player123@google.com', '2019-01-07 09:30:00'),
    ('player123@google.com', '2019-01-08 08:30:00'),
    ('player123@google.com', '2019-01-15 06:30:00'),
    ('player999@google.com', '2019-01-08 11:30:00'),
    ('player999@google.com', '2019-01-10 07:30:00'),
    ('player555@google.com', '2019-01-08 04:30:00')

What I have tried:

;with
cte1 AS (
    SELECT CAST(Creation_Date AS Date) AS Creation_Date
        ,CONCAT(DATEPART(MONTH,Creation_Date),'-',DATEPART(YEAR,Creation_Date)) AS Cohort
        ,email AS Emails
    FROM member
        ),
cte2 AS (
    SELECT Logins
        ,yy
        ,login_ISOWeeks
        ,Emails
    FROM (
        SELECT CAST(login_date as Date) AS Logins
            ,DATEPART(YEAR, login_date) AS yy
            ,DATEPART(ISO_WEEK,login_date) AS login_ISOWeeks
            ,email AS Emails
            ,ROW_NUMBER()
                OVER(PARTITION BY DATEPART(YEAR, login_date), DATEPART(ISO_WEEK,login_date), email ORDER BY login_date ASC) AS week_count
        FROM login) as f_log
    WHERE f_log.week_count = 1
        )

SELECT cte1.Creation_Date
    ,cte1.Cohort
    ,cte2.yy
    ,cte2.login_ISOWeeks
    ,cte1.Emails
FROM cte1
INNER JOIN cte2 ON cte1.Emails=cte2.Emails

Desired output:

Cohort     2019_2  2019_3
jan 2018   2       1
may 2018   1       0

1 Answer:

Answer 0 (score: 2)

Your data has a lot of oddities. Why is the join key an email address rather than a member ID? Why is a member's email "created" more than once?
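One way to see whether the duplicate-creation issue actually affects a given table is a quick check like the one below. It is only a sketch against the `member` table and column names from the question; the output column names (`num_rows`, `first_created`, `last_created`) are made up for illustration.

```sql
-- List every email that appears more than once in member,
-- together with its earliest and latest creation_date.
select m.email,
       count(*)             as num_rows,
       min(m.creation_date) as first_created,
       max(m.creation_date) as last_created
from member m
group by m.email
having count(*) > 1;
```

On the three sample rows shown in the question this returns no rows, since each email appears once. Any rows it returns on the real table are the ones that would otherwise be counted more than once after a join.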

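To keep the join from multiplying rows, I aggregate each table before joining: `member` is reduced to one row per email, and `login` to one row per email, year, and ISO week. This produces the results you want: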

select datename(year, m.creation_date) + '-' + datename(month, m.creation_date) as yyyymm,
       count(distinct m.email) as num_members,
       sum(case when l.yyyy = 2019 and l.isoweek = 2 then 1 else 0 end) as cnt_201902,
       sum(case when l.yyyy = 2019 and l.isoweek = 3 then 1 else 0 end) as cnt_201903
from (select m.email, min(creation_date) as creation_date
      from member m
      group by m.email
     ) m left join
     (select distinct l.email, year(l.login_date) as yyyy, datepart(iso_week, l.login_date) as isoweek
      from login l
     ) l
     on m.email = l.email
group by datename(year, m.creation_date) + '-' + datename(month, m.creation_date) 
order by yyyymm;

Here is a db<>fiddle.
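The question also asks for a retention rate (the share of a cohort's week-2 logins that return in week 3). The counts per week alone do not give that, because the week-3 count would have to be restricted to members who also logged in during week 2. A possible sketch, assuming the same `member` and `login` tables and hard-coding ISO weeks 2 and 3 of 2019, is shown below; the CTE names and the columns `in_w2`, `in_w3`, `cnt_w2_and_w3`, and `retention_pct` are illustrative, not part of the original query.

```sql
with m as (
    -- One row per member, using the earliest creation date.
    select email, min(creation_date) as creation_date
    from member
    group by email
),
l as (
    -- One row per member per (year, ISO week) with at least one login.
    select distinct email,
           year(login_date)               as yyyy,
           datepart(iso_week, login_date) as isoweek
    from login
),
per_member as (
    -- Flag, per member, whether they logged in during each week of interest.
    select m.email,
           datename(year, m.creation_date) + '-' + datename(month, m.creation_date) as cohort,
           max(case when l.yyyy = 2019 and l.isoweek = 2 then 1 else 0 end) as in_w2,
           max(case when l.yyyy = 2019 and l.isoweek = 3 then 1 else 0 end) as in_w3
    from m
    left join l on m.email = l.email
    group by m.email, m.creation_date
)
select cohort,
       sum(in_w2)         as cnt_w2,
       sum(in_w2 * in_w3) as cnt_w2_and_w3,
       -- nullif avoids a divide-by-zero when nobody in the cohort logged in during week 2.
       100.0 * sum(in_w2 * in_w3) / nullif(sum(in_w2), 0) as retention_pct
from per_member
group by cohort
order by cohort;
```

With the sample data from the question, this should give 50% for the January 2018 cohort (two members in week 2, one of whom returns in week 3) and 0% for May 2018. Note that `year(login_date)` and the ISO week can disagree in the days around New Year, so a range of weeks that crosses a year boundary would need an ISO-year calculation instead.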