I have some user data, shown below, and I want a running total of the unique users seen up to each day. Starting with a basic query:
SELECT
day, user_id, COUNT(DISTINCT(user_id)) AS cnt
FROM
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "B" user_id, "2015-02-01" day),
(select "B" user_id, "2015-02-02" day),
(select "B" user_id, "2015-02-02" day),
(select "B" user_id, "2015-02-02" day),
(select "C" user_id, "2015-02-01" day),
(select "C" user_id, "2015-02-02" day),
(select "D" user_id, "2015-02-04" day)
GROUP BY
day, user_id
The result of this grouping is:
Row day user_id cnt
1 2015-02-01 A 1
2 2015-02-01 B 1
3 2015-02-02 B 1
4 2015-02-01 C 1
5 2015-02-02 C 1
6 2015-02-04 D 1
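I can see that there are three unique users on 2015-02-01, and that no new users appear until 2015-02-04, when there is just one (user D).
I need the result to look like this: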
Row day running_count
1 2015-02-01 3
2 2015-02-02 3
3 2015-02-03 3
4 2015-02-04 4
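running_count is a running tally of the new users seen on each day. For example, 2015-02-02 adds zero, because only user_ids B and C appear that day and both were already counted on 2015-02-01.
Thanks in advance for your help.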
Answer 0 (score: 1)
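Look only at MIN(day) for each user, so that every user is counted once on the day they first appear, then use SUM() OVER() to get the running count. The result will be missing the interim dates (2015-02-03 here), but you can get those with a LEFT JOIN against a table of all dates.
SELECT day, SUM(c) OVER(ORDER BY day) AS running_count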
FROM (
SELECT day, COUNT(DISTINCT user_id) c
FROM (
SELECT MIN(day) day, user_id
FROM
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "A" user_id, "2015-02-01" day),
(select "B" user_id, "2015-02-01" day),
(select "B" user_id, "2015-02-02" day),
(select "B" user_id, "2015-02-02" day),
(select "B" user_id, "2015-02-02" day),
(select "C" user_id, "2015-02-01" day),
(select "C" user_id, "2015-02-02" day),
(select "D" user_id, "2015-02-04" day)
GROUP BY user_id
)
GROUP BY day
)
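The LEFT JOIN step for the interim dates is not spelled out above, so here is a rough sketch of the whole logic in Python rather than BigQuery SQL, which makes it easy to check against the expected output in the question. It assumes the calendar should run from the earliest to the latest day in the data; the function name running_new_users and the in-memory rows list are my own, made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Sample rows from the question as (user_id, day); duplicates kept on purpose.
rows = (
    [("A", "2015-02-01")] * 6
    + [("B", "2015-02-01")]
    + [("B", "2015-02-02")] * 3
    + [("C", "2015-02-01"), ("C", "2015-02-02"), ("D", "2015-02-04")]
)


def running_new_users(rows):
    """Return (day, running_count) pairs for every calendar day in the data."""
    if not rows:
        return []

    # Step 1: first day each user appears (MIN(day) ... GROUP BY user_id).
    first_seen = {}
    for user_id, day in rows:
        d = date.fromisoformat(day)
        if user_id not in first_seen or d < first_seen[user_id]:
            first_seen[user_id] = d

    # Step 2: number of new users per first-seen day (COUNT ... GROUP BY day).
    new_per_day = Counter(first_seen.values())

    # Step 3: walk every calendar day between the first and last day in the
    # data (the role of the LEFT JOIN against a dates table) while keeping
    # a running sum (SUM(c) OVER(ORDER BY day)).
    all_days = [date.fromisoformat(day) for _, day in rows]
    current, last = min(all_days), max(all_days)
    result, running = [], 0
    while current <= last:
        running += new_per_day.get(current, 0)  # 0 for days with no new users
        result.append((current.isoformat(), running))
        current += timedelta(days=1)
    return result


for day, running_count in running_new_users(rows):
    print(day, running_count)
```

In SQL the same effect would come from joining a table that holds one row per date to the per-day counts and treating the missing counts as zero before applying the window sum.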