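I have a BQ table, user_events, that looks like the following:
event_date | user_id | event_type
The data covers millions of users across different event dates.
I want to write a query that gives me, for each day, the list of users who were active in the last 30 days.
What I have so far only gives me the distinct users for the current day; I can't get it to give me the last 30 days for each date. Any help is appreciated.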
Answer 0 (score: 2)
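Below is for BigQuery Standard SQL, with a few assumptions about your case. As the queries imply: (1) the table has one row per user per date, and (2) a user counts as active on a date when they have at least 5 rows in the 30 days before that date.
If the above makes sense, see below: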
#standardSQL
SELECT
  user_id, event_date
FROM (
  SELECT
    user_id, event_date,
    (COUNT(1)
      OVER(PARTITION BY user_id
        ORDER BY UNIX_DATE(event_date)
        RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
    ) >= 5 AS activity
  FROM `yourTable`
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
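To make the window frame easier to follow, here is a minimal Python sketch of what the query above computes. It is not BigQuery code and is not meant for millions of rows (it is quadratic). The function name `active_rows`, its parameters, and the sample data are assumptions made up for illustration.

```python
from datetime import date, timedelta

def active_rows(rows, window_days=30, min_rows=5):
    """Return sorted distinct (user_id, event_date) pairs flagged active.

    Mimics RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING: for each row, count
    the same user's rows dated 1..window_days days earlier (the row's own
    date is excluded) and keep the row if the count reaches min_rows.
    """
    flagged = set()
    for user_id, event_date in rows:
        lo = event_date - timedelta(days=window_days)
        hi = event_date - timedelta(days=1)
        prior = sum(1 for u, d in rows if u == user_id and lo <= d <= hi)
        if prior >= min_rows:
            flagged.add((user_id, event_date))  # set plays the role of the outer GROUP BY
    return sorted(flagged)

# Hypothetical sample: u1 has one row on each of six consecutive days, u2 on two.
sample = [("u1", date(2017, 1, d)) for d in range(1, 7)]
sample += [("u2", date(2017, 1, 1)), ("u2", date(2017, 1, 2))]

result = active_rows(sample)
print(result)
```

Only u1 on 2017-01-06 is flagged here, because that is the first date with five rows in the preceding 30 days. The current date itself never counts toward the threshold, since the frame ends at 1 PRECEDING.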
If assumption #1 above does not hold, you can simply add a pre-grouping step as a subselect:
#standardSQL
SELECT
  user_id, event_date
FROM (
  SELECT
    user_id, event_date,
    (COUNT(1)
      OVER(PARTITION BY user_id
        ORDER BY UNIX_DATE(event_date)
        RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
    ) >= 5 AS activity
  FROM (
    SELECT user_id, event_date
    FROM `yourTable`
    GROUP BY user_id, event_date
  )
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
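The reason the pre-grouping matters: COUNT(1) counts rows, not days, so several events on the same date would inflate the count. A small Python sketch of that effect follows. The helper `prior_count` and the sample rows are hypothetical, and this is a simulation rather than BigQuery code.

```python
from datetime import date, timedelta

def prior_count(rows, user_id, day, window_days=30):
    """Count the user's rows dated 1..window_days days before `day`."""
    lo = day - timedelta(days=window_days)
    hi = day - timedelta(days=1)
    return sum(1 for u, d in rows if u == user_id and lo <= d <= hi)

# Hypothetical raw data: u1 fired three events on Jan 1, two on Jan 2, one on Jan 3.
raw = (
    [("u1", date(2017, 1, 1))] * 3
    + [("u1", date(2017, 1, 2))] * 2
    + [("u1", date(2017, 1, 3))]
)

# Equivalent of: SELECT user_id, event_date ... GROUP BY user_id, event_date
deduped = sorted(set(raw))

raw_count = prior_count(raw, "u1", date(2017, 1, 3))        # counts events
dedup_count = prior_count(deduped, "u1", date(2017, 1, 3))  # counts distinct days
print(raw_count, dedup_count)
```

Without the pre-grouping, u1 reaches the threshold of 5 on Jan 3 after only two earlier days with events. With it, the count reflects two distinct days.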
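Update
From the comments: a user is considered active if they have any event_type IN ('view', 'conversion', 'productDetail', 'search'). That means any kind of event triggered in the app.
So, I think you can go with the below: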
#standardSQL
SELECT
  user_id, event_date
FROM (
  SELECT
    user_id, event_date,
    (COUNT(1)
      OVER(PARTITION BY user_id
        ORDER BY UNIX_DATE(event_date)
        RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
    ) >= 5 AS activity
  FROM (
    SELECT user_id, event_date
    FROM `yourTable`
    WHERE event_type IN ('view', 'conversion', 'productDetail', 'search')
    GROUP BY user_id, event_date
  )
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
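As a rough end-to-end check of the three steps (filter by event type, collapse to one row per user per date, then apply the 30-day window), here is a Python simulation. It is only a sketch: `active_days`, the `login` event type, and the sample events are assumptions for illustration and do not come from the question.

```python
from datetime import date, timedelta

ACTIVE_TYPES = {"view", "conversion", "productDetail", "search"}

def active_days(events, window_days=30, min_days=5):
    """Return sorted (user_id, event_date) pairs that pass the activity test."""
    # WHERE event_type IN (...) plus GROUP BY user_id, event_date
    days = {(u, d) for u, d, t in events if t in ACTIVE_TYPES}
    flagged = []
    for user_id, day in sorted(days):
        lo = day - timedelta(days=window_days)
        hi = day - timedelta(days=1)
        # Distinct qualifying days in the 30 days before `day`
        prior = sum(1 for u, d in days if u == user_id and lo <= d <= hi)
        if prior >= min_days:
            flagged.append((user_id, day))
    return flagged

# Hypothetical events: u1 views something on six consecutive days;
# u2 only has a non-qualifying 'login' type until a single search on Jan 6.
events = [("u1", date(2017, 1, d), "view") for d in range(1, 7)]
events += [("u2", date(2017, 1, d), "login") for d in range(1, 7)]
events.append(("u2", date(2017, 1, 6), "search"))

flagged = active_days(events)
print(flagged)
```

u2 is not flagged because the filter drops the `login` rows before the window is applied. Lowering `min_days` to 1 corresponds to changing `>= 5` to `>= 1` in the query, if a single qualifying day in the window should be enough.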