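I'm trying to build a cohort query in SQL. In a typical cohort analysis, we take the users who performed a particular action in a particular period and then count how many of them perform the same action again in later periods.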
-- One row per user per week in which the user logged in
WITH by_week AS (
    SELECT
        user_id,
        TD_DATE_TRUNC('week', login_time) AS login_week
    FROM logins
    GROUP BY 1, 2
),
-- Attach each user's first login week (the cohort)
with_first_week AS (
    SELECT
        user_id,
        login_week,
        FIRST_VALUE(login_week) OVER (PARTITION BY user_id ORDER BY login_week) AS first_week
    FROM by_week
),
-- Weeks elapsed since the first login week; timestamps are in seconds
with_week_number AS (
    SELECT
        user_id,
        login_week,
        first_week,
        (login_week - first_week) / (24 * 60 * 60 * 7) AS week_number
    FROM with_first_week
)
SELECT
    TD_TIME_FORMAT(first_week, 'yyyy-MM-dd') AS first_week,
    SUM(CASE WHEN week_number = 1 THEN 1 ELSE 0 END) AS week_1,
    SUM(CASE WHEN week_number = 2 THEN 1 ELSE 0 END) AS week_2,
    SUM(CASE WHEN week_number = 3 THEN 1 ELSE 0 END) AS week_3,
    SUM(CASE WHEN week_number = 4 THEN 1 ELSE 0 END) AS week_4,
    SUM(CASE WHEN week_number = 5 THEN 1 ELSE 0 END) AS week_5,
    SUM(CASE WHEN week_number = 6 THEN 1 ELSE 0 END) AS week_6,
    SUM(CASE WHEN week_number = 7 THEN 1 ELSE 0 END) AS week_7,
    SUM(CASE WHEN week_number = 8 THEN 1 ELSE 0 END) AS week_8,
    SUM(CASE WHEN week_number = 9 THEN 1 ELSE 0 END) AS week_9
FROM with_week_number
GROUP BY 1
ORDER BY 1
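But now, say I don't care much about the first-time/user-level analysis; I just want to know whether my login activity is increasing over time (i.e., I want to add the first cohort's week 2 logins to the second cohort's week 1 logins). Is there a simple/elegant way to do this?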
Edit:
Here is an example:
WeekStart     Week1             Week2          Week3
2017/05/03    66                **53**         **49**
2017/05/10    (**53**+74)       (**49**+70)    **65**
2017/05/17    (**49**+70+45)    (**65**+80)    etc.
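One way to read the example above is that each combined figure is the sum of every cohort cell that falls in the same calendar week. The sketch below reproduces the Week1 column under that reading, in Presto-style SQL. The `cohort_counts` table and its column names are hypothetical; the values 74, 70 and 45 are assumed to be the later cohorts' own counts, as implied by the sums in the example; and week 1 is assumed to be each cohort's first week.

```sql
-- Hypothetical long-form cohort table: one row per (cohort, week number).
WITH cohort_counts (first_week, week_number, users) AS (
    VALUES
        (DATE '2017-05-03', 1, 66),
        (DATE '2017-05-03', 2, 53),
        (DATE '2017-05-03', 3, 49),
        (DATE '2017-05-10', 1, 74),
        (DATE '2017-05-10', 2, 70),
        (DATE '2017-05-17', 1, 45)
)
SELECT
    -- Week 1 is the cohort's own start week, so shift by week_number - 1
    date_add('week', week_number - 1, first_week) AS calendar_week,
    SUM(users) AS total_users
FROM cohort_counts
GROUP BY 1
ORDER BY 1;
```

With these inputs the rows come out as 66 for 2017-05-03, 127 (53 + 74) for 2017-05-10, and 164 (49 + 70 + 45) for 2017-05-17.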
Answer 0 (score: 1)
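I think you need to group by login_week instead of first_week, so that each row counts all logins within the given week rather than a single cohort. You then have to use >= instead of =, so that each column adds up the cohort of that age together with all older cohorts in the given row.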
WITH
by_week AS (
    SELECT
        user_id,
        TD_DATE_TRUNC('week', login_time) AS login_week
    FROM logins
    GROUP BY 1, 2
)
,with_first_week AS (
    SELECT
        user_id,
        login_week,
        FIRST_VALUE(login_week) OVER (PARTITION BY user_id ORDER BY login_week) AS first_week
    FROM by_week
)
,with_week_number AS (
    SELECT
        user_id,
        login_week,
        first_week,
        (login_week - first_week) / (24 * 60 * 60 * 7) AS week_number
    FROM with_first_week
)
SELECT
    TD_TIME_FORMAT(login_week, 'yyyy-MM-dd') AS login_week,
    SUM(CASE WHEN week_number >= 1 THEN 1 ELSE 0 END) AS week_1,
    SUM(CASE WHEN week_number >= 2 THEN 1 ELSE 0 END) AS week_2,
    SUM(CASE WHEN week_number >= 3 THEN 1 ELSE 0 END) AS week_3,
    SUM(CASE WHEN week_number >= 4 THEN 1 ELSE 0 END) AS week_4,
    SUM(CASE WHEN week_number >= 5 THEN 1 ELSE 0 END) AS week_5,
    SUM(CASE WHEN week_number >= 6 THEN 1 ELSE 0 END) AS week_6,
    SUM(CASE WHEN week_number >= 7 THEN 1 ELSE 0 END) AS week_7,
    SUM(CASE WHEN week_number >= 8 THEN 1 ELSE 0 END) AS week_8,
    SUM(CASE WHEN week_number >= 9 THEN 1 ELSE 0 END) AS week_9
FROM with_week_number
GROUP BY 1
ORDER BY 1;
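If only the overall weekly trend matters and no cohort breakdown is needed, a shorter variant may be enough. The sketch below is not part of the answer above; it reuses the same `logins` table and the `by_week` step, and simply counts the distinct users seen in each week. The `active_users` column name is an assumption chosen for illustration.

```sql
-- Sketch: distinct users who logged in during each week, ignoring cohorts.
WITH by_week AS (
    SELECT
        user_id,
        TD_DATE_TRUNC('week', login_time) AS login_week
    FROM logins
    GROUP BY 1, 2          -- one row per user per week
)
SELECT
    TD_TIME_FORMAT(login_week, 'yyyy-MM-dd') AS login_week,
    COUNT(*) AS active_users   -- by_week is already deduplicated per user
FROM by_week
GROUP BY 1
ORDER BY 1;
```

This counts users in their first week as well, whereas the `week_number >= 1` column in the query above leaves them out.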