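Goal: I need to segment emails by the month in which they subscribed, which defines the cohort. In other words, everyone who subscribed in January 2018 is in one cohort, everyone who subscribed in February 2018 is in another. I then need to look at their login activity from one week to the next. If 100 subscribers from the January 2018 cohort logged in during ISO_WEEK 2 of 2019, and 70 of those logged in again during ISO_WEEK 3, the retention rate is 70%.
Problem: I'm not sure how to write a query that returns the cohort (e.g. Jan2018, Feb2018, Mar2018) as the first column, with the following columns holding the count of distinct emails that had login activity in each ISO_WEEK, starting from 2019.
Sample data: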
CREATE TABLE member
([email] varchar(50), [creation_date] Datetime)
INSERT INTO member
VALUES
('player123@google.com', '2018-01-01 05:00:00'),
('player999@google.com', '2018-01-30 12:00:00'),
('player555@google.com', '2018-05-14 20:15:00')
CREATE TABLE login
([email] varchar(100), [login_date] Datetime)
INSERT INTO login
VALUES
('player123@google.com', '2019-01-07 05:30:00'),
('player123@google.com', '2019-01-07 09:30:00'),
('player123@google.com', '2019-01-08 08:30:00'),
('player123@google.com', '2019-01-15 06:30:00'),
('player999@google.com', '2019-01-08 11:30:00'),
('player999@google.com', '2019-01-10 07:30:00'),
('player555@google.com', '2019-01-08 04:30:00')
What I've tried:
;with
cte1 AS (
SELECT CAST(Creation_Date AS Date) AS Creation_Date
,CONCAT(DATEPART(MONTH,Creation_Date),'-',DATEPART(YEAR,Creation_Date)) AS Cohort
,email AS Emails
FROM member
),
cte2 AS (
SELECT Logins
,yy
,login_ISOWeeks
,Emails
FROM (
SELECT CAST(login_date as Date) AS Logins
,DATEPART(YEAR, login_date) AS yy
,DATEPART(ISO_WEEK,login_date) AS login_ISOWeeks
,email AS Emails
,ROW_NUMBER()
OVER(PARTITION BY DATEPART(YEAR, login_date), DATEPART(ISO_WEEK,login_date), email ORDER BY login_date ASC) AS week_count
FROM login) as f_log
WHERE f_log.week_count = 1
)
SELECT cte1.Creation_Date
,cte1.Cohort
,cte2.yy
,cte2.login_ISOWeeks
,cte1.Emails
FROM cte1
INNER JOIN cte2 ON cte1.Emails=cte2.Emails
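A side note on pairing DATEPART(YEAR, ...) with DATEPART(ISO_WEEK, ...), as the query above does: around New Year the calendar year and the ISO week-numbering year can differ, so a login on 2018-12-31 would be labelled year 2018, week 1. SQL Server has no ISO-year datepart, so the sketch below uses a commonly seen workaround (shifting the date by `26 - ISO_WEEK` days before taking the year). The column alias `iso_year` and the three test dates are illustrative assumptions, not part of the original question.

```sql
-- Compare the calendar year with the ISO week-numbering year
-- for a few dates near a year boundary.
SELECT d.login_date,
       DATEPART(YEAR, d.login_date)     AS calendar_year,
       DATEPART(ISO_WEEK, d.login_date) AS iso_week,
       -- Move the date away from the year boundary (forward for low
       -- week numbers, backward for high ones) before taking the year.
       YEAR(DATEADD(DAY, 26 - DATEPART(ISO_WEEK, d.login_date), d.login_date)) AS iso_year
FROM (VALUES (CAST('2018-12-31' AS date)),
             (CAST('2019-01-07' AS date)),
             (CAST('2021-01-03' AS date))) AS d(login_date);
```

For the sample logins, which all fall in mid-January 2019, both year values agree, so this only matters once the data includes the first or last days of a year.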
Desired output:
Cohort     2019_2   2019_3
jan 2018   2        1
may 2018   1        0
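Answer 0: (score: 2)
Your data has a lot of oddities. Why is the join key the email address rather than a member ID? Why are members with the same email "created" more than once?
To keep the join from multiplying rows, I aggregate each table before joining. This produces the results you want: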
select datename(year, m.creation_date) + '-' + datename(month, m.creation_date) as yyyymm,
count(distinct m.email) as num_members,
sum(case when l.yyyy = 2019 and l.isoweek = 2 then 1 else 0 end) as cnt_201902,
sum(case when l.yyyy = 2019 and l.isoweek = 3 then 1 else 0 end) as cnt_201903
from (select m.email, min(creation_date) as creation_date
from member m
group by m.email
) m left join
(select distinct l.email, year(l.login_date) as yyyy, datepart(iso_week, l.login_date) as isoweek
from login l
) l
on m.email = l.email
group by datename(year, m.creation_date) + '-' + datename(month, m.creation_date)
order by yyyymm;
Here is a db<>fiddle.
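The question also asks for a retention rate (of the members active in ISO week 2, the share who were active again in ISO week 3). A minimal sketch of one way to get there, assuming the `member` and `login` tables from the question and SQL Server 2012 or later for DATEFROMPARTS; the CTE names and the columns `cohort_month`, `in_wk2`, `in_wk3`, `active_wk2`, `retained_wk3`, and `retention_pct` are hypothetical names chosen for illustration:

```sql
WITH members AS (
    -- One row per email, keeping the earliest creation date.
    SELECT email, MIN(creation_date) AS creation_date
    FROM member
    GROUP BY email
),
weekly AS (
    -- One row per email and login week (calendar year + ISO week).
    SELECT DISTINCT email,
           YEAR(login_date)               AS yyyy,
           DATEPART(ISO_WEEK, login_date) AS isoweek
    FROM login
),
flags AS (
    -- Per member: 1 if they logged in during week 2 / week 3 of 2019, else 0.
    SELECT m.email,
           DATEFROMPARTS(YEAR(m.creation_date), MONTH(m.creation_date), 1) AS cohort_month,
           MAX(CASE WHEN w.yyyy = 2019 AND w.isoweek = 2 THEN 1 ELSE 0 END) AS in_wk2,
           MAX(CASE WHEN w.yyyy = 2019 AND w.isoweek = 3 THEN 1 ELSE 0 END) AS in_wk3
    FROM members m
    LEFT JOIN weekly w ON w.email = m.email
    GROUP BY m.email,
             DATEFROMPARTS(YEAR(m.creation_date), MONTH(m.creation_date), 1)
)
SELECT cohort_month,
       SUM(in_wk2)          AS active_wk2,
       SUM(in_wk2 * in_wk3) AS retained_wk3,
       -- NULLIF avoids a divide-by-zero for cohorts with no week-2 logins.
       100.0 * SUM(in_wk2 * in_wk3) / NULLIF(SUM(in_wk2), 0) AS retention_pct
FROM flags
GROUP BY cohort_month
ORDER BY cohort_month;
```

With the sample data, the January 2018 cohort has two members active in week 2, one of whom returns in week 3, which gives 50%. Using a date for `cohort_month` rather than a year-and-month-name string is a design choice here: it makes ORDER BY sort cohorts chronologically instead of alphabetically.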