I have a table "events" consisting of 2 columns:
userId | eventDate
-------+-------------------
s234124| 2015-01-01
a2s3166| 2015-01-02
c216782| 2015-01-03
z312235| 2015-01-04
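userId is the ID of the user. eventDate is the date on which that user had an event.
For each day, I want to count the active unique users over the 30-day (or 7-day, 60-day, etc.) period ending on that date. An active unique user is defined as a userId with at least one event during the given window.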
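To make the definition concrete, here is a minimal brute-force sketch in Python rather than SQL. The function name `rolling_active_users`, the in-memory list of `(userId, eventDate)` pairs, and the extra repeat event are assumptions for illustration only. It reports one count per date that has at least one event.

```python
from datetime import date, timedelta

def rolling_active_users(events, window):
    """Map each event date to the number of distinct users with at least
    one event in the `window`-day period ending on that date (inclusive)."""
    result = {}
    for end in sorted({d for _, d in events}):
        start = end - timedelta(days=window - 1)
        # A user is active if any of their events falls in [start, end].
        result[end] = len({u for u, d in events if start <= d <= end})
    return result

events = [
    ("s234124", date(2015, 1, 1)),
    ("a2s3166", date(2015, 1, 2)),
    ("c216782", date(2015, 1, 3)),
    ("z312235", date(2015, 1, 4)),
    ("s234124", date(2015, 1, 4)),  # hypothetical repeat event, not in the sample
]
counts = rolling_active_users(events, window=2)
```

With a 2-day window, 2015-01-04 counts the three users seen on 2015-01-03 and 2015-01-04. This scans all events for every date, so it only illustrates the definition and is not meant for large tables.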
I read this article, which describes a similar problem, but I couldn't adapt it to my use case.
Answer 0 (score: 6)
Assuming your table, dataset.your_table, has two fields, userid and date:
SELECT
  date,
  -- Pivot the per-period counts into one column per window length
  SUM(CASE WHEN period = 7 THEN users END) as days_07,
  SUM(CASE WHEN period = 14 THEN users END) as days_14,
  SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
  SELECT
    dates.date as date,
    periods.period as period,
    EXACT_COUNT_DISTINCT(activity.userid) as users
  FROM dataset.your_table as activity
  -- Every distinct date in the table becomes a reporting date
  CROSS JOIN (SELECT date FROM dataset.your_table GROUP BY date) as dates
  -- In BigQuery legacy SQL, commas between subqueries in FROM mean UNION ALL
  CROSS JOIN (SELECT period FROM (SELECT 7 as period),
    (SELECT 14 as period), (SELECT 30 as period)) as periods
  -- Keep activity that falls 0 to period-1 days before the reporting date
  WHERE dates.date >= activity.date
    AND INTEGER(FLOOR(DATEDIFF(dates.date, activity.date)/periods.period)) = 0
  GROUP BY 1,2
)
GROUP BY date
ORDER BY date DESC
The result looks like this:
date        days_07    days_14    days_30
8/29/2015   2,468,649  3,597,684  7,180,175
8/28/2015   2,472,342  3,592,680  6,969,581
8/27/2015   2,486,979  3,595,822  6,745,625
8/26/2015   2,507,572  3,576,816  6,494,710
8/25/2015   2,508,036  3,553,386  6,264,950
8/24/2015   2,511,946  3,521,184  6,024,151
8/23/2015   2,488,485  3,482,163  5,774,763
8/22/2015   2,474,526  3,450,719  5,547,318
8/21/2015   2,463,568  3,422,003  5,327,760
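The two WHERE conditions in the query decide which activity rows count toward each reporting date. A small Python sketch of that predicate follows, assuming DATEDIFF returns the number of whole days between the two dates; the function name `in_window` is hypothetical and not part of the answer.

```python
import math
from datetime import date

def in_window(report_date, activity_date, period):
    """Mirror of the query's filter: the activity must not be after the
    reporting date, and floor(day_difference / period) must be 0."""
    diff = (report_date - activity_date).days
    return report_date >= activity_date and math.floor(diff / period) == 0

same_week = in_window(date(2015, 8, 29), date(2015, 8, 23), 7)   # 6 days earlier
week_ago = in_window(date(2015, 8, 29), date(2015, 8, 22), 7)    # 7 days earlier
future = in_window(date(2015, 8, 29), date(2015, 8, 30), 7)      # after the reporting date
```

Under that assumption the filter is equivalent to `0 <= diff < period`, so a 7-day window covers the reporting date and the 6 days before it.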