我对此查询有疑问。
SELECT event_date, country, COUNT(*) AS sessions,
AVG(length) AS average_session_length
FROM (
SELECT country, event_date, global_session_id,
(MAX(event_timestamp) - MIN(event_timestamp))/(60 * 1000 * 1000) AS length
FROM (
SELECT user_pseudo_id,
event_timestamp,
country,
event_date,
SUM(is_new_session) OVER (ORDER BY user_pseudo_id, event_timestamp) AS global_session_id,
SUM(is_new_session) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS user_session_id
FROM (
SELECT *,
CASE WHEN event_timestamp - last_event >= (30*60*1000*1000)
OR last_event IS NULL
THEN 1 ELSE 0 END AS is_new_session
FROM (
SELECT user_pseudo_id,
event_timestamp,
geo.country,
event_date,
LAG(event_timestamp,1) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS last_event
FROM `xxx.events*`
) last
) final
) session
GROUP BY global_session_id, country, event_date
) agg
WHERE length >= (10/60)
group by country, event_date
Google Cloud Console给出了错误
Resources exceeded during query execution: The query could not be executed in the allotted memory.
我知道OVER
子句可能是一个问题,但是我不知道如何编辑查询来获得相同的结果。
感谢您的帮助。
谢谢你们!
答案 0 :(得分:1)
如果我不得不猜测,就是这一行:
SUM(is_new_session) OVER (ORDER BY user_pseudo_id, event_timestamp) AS global_session_id,
我建议更改代码,以使“全局”会话ID真正对于每个用户而言都是本地的:
SUM(is_new_session) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS global_session_id,
如果您调整查询并使其基本起作用,则资源问题将得到解决。下一步是弄清楚如何获取所需的全局ID。最简单的解决方案是为每个用户使用本地ID。