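Calculating the most frequently triggered event per user over a period of time

Time: 2017-06-30 08:28:27

Tags: google-bigquery

I have pulled a list of UserIDs from a website for users who triggered a specific Google Analytics event over the past few weeks. I am currently using the query below, which returns one row per user ID per event label.

I would like to enhance this query by calculating the event each user ID clicked most often during this period, and return only that event rather than every event the user triggered. Can anyone suggest a good way to achieve this?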

SELECT customDimension.value AS UserID, hits.eventinfo.eventAction AS Size
FROM `*.ga_sessions_*` AS t
  CROSS JOIN UNNEST(hits) AS hits
  CROSS JOIN UNNEST(t.customdimensions) AS customDimension
WHERE (_TABLE_SUFFIX BETWEEN '20170601' AND '20170628')
AND (hits.page.pagePath LIKE "%/shorts%" OR hits.page.pagePath LIKE "%/t-shirts%")
AND hits.eventinfo.eventCategory = "SIZE Filter Click"
AND (hits.eventinfo.eventAction = "S" OR hits.eventinfo.eventAction = "M" OR hits.eventinfo.eventAction = "L" OR hits.eventinfo.eventAction = "XL")
AND customDimension.index = 2
GROUP BY UserID, Size

1 answer:

Answer 0 (score: 1)

I think this might work for you:

WITH data AS(
  select
    ARRAY<STRUCT<index INT64, value STRING>> [
      STRUCT(NULL as index, '' as value),
      STRUCT(0 as index, 'test' as value),
      STRUCT(2 as index, 'user_1' as value)
    ] customDimensions,
    ARRAY<STRUCT<page STRUCT<pagePath STRING>, eventinfo STRUCT<eventcategory STRING, eventaction STRING> >> [
      STRUCT(STRUCT('/home' as pagePath) as page, STRUCT("cat1" as eventcategory, "act1" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "S" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "S" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "M" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abc/t-shirtsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "L" as eventaction) as eventinfo)
    ] hits
  union all
  select
    ARRAY<STRUCT<index INT64, value STRING>> [
      STRUCT(2 as index, 'user_2' as value)
    ] customDimensions,
    ARRAY<STRUCT<page STRUCT<pagePath STRING>, eventinfo STRUCT<eventcategory STRING, eventaction STRING> >> [
      STRUCT(STRUCT('shorts' as pagePath) as page, STRUCT("cat1" as eventcategory, "act1" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "M" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "M" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abcshortsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "M" as eventaction) as eventinfo),
      STRUCT(STRUCT('/abc/t-shirtsabc' as pagePath) as page, STRUCT("SIZE Filter Click" as eventcategory, "L" as eventaction) as eventinfo)
    ] hits
)

SELECT
  (SELECT value FROM UNNEST(customDimensions) WHERE index = 2 GROUP BY value) UserID,
  (
    SELECT eventinfo.eventaction size
    FROM UNNEST(hits)
    WHERE (REGEXP_CONTAINS(page.pagepath, r'shorts') OR REGEXP_CONTAINS(page.pagepath, r'/t-shirts'))
      AND eventinfo.eventcategory = 'SIZE Filter Click'
    GROUP BY eventinfo.eventaction
    ORDER BY COUNT(eventinfo.eventaction) DESC
    LIMIT 1
  ) most_clicked_size
FROM data

If you also want to retrieve the total number of clicks for the most frequent label (the action, in your case), you can do the following:

SELECT
  (SELECT value FROM UNNEST(customDimensions) WHERE index = 2 GROUP BY value) UserID,
  (
    SELECT AS STRUCT eventinfo.eventaction size, count(1) freq
    FROM UNNEST(hits)
    WHERE (REGEXP_CONTAINS(page.pagepath, r'shorts') OR REGEXP_CONTAINS(page.pagepath, r'/t-shirts'))
      AND eventinfo.eventcategory = 'SIZE Filter Click'
    GROUP BY eventinfo.eventaction
    ORDER BY COUNT(eventinfo.eventaction) DESC
    LIMIT 1
  ) most_clicked_size
FROM data

Here, data is a mock of your actual session data.
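One thing to keep in mind: the subqueries above are evaluated once per row of data, and each row is one session, so a user with several sessions in the date range gets one result per session rather than one per period. If you need a single most-clicked size per user across all sessions, one possible approach (a sketch, not necessarily what the answer intended) is to count clicks per user and size first and then keep the top-ranked row with ROW_NUMBER(). The mock below is deliberately simplified and hypothetical: the columns UserID and sizes and the names clicks, ranked, clicked_size and rn are made up for illustration and do not come from the GA export schema.

```sql
-- Hypothetical, simplified mock: one row per session, with the sizes clicked
-- in that session. user_1 appears in two sessions.
WITH data AS (
  SELECT 'user_1' AS UserID, ['S', 'M'] AS sizes UNION ALL
  SELECT 'user_1', ['M', 'M', 'L'] UNION ALL
  SELECT 'user_2', ['L']
),
clicks AS (
  -- One row per (user, size) with the click count summed over all sessions.
  SELECT UserID, clicked_size, COUNT(*) AS freq
  FROM data
  CROSS JOIN UNNEST(sizes) AS clicked_size
  GROUP BY UserID, clicked_size
),
ranked AS (
  -- Rank sizes within each user; the second sort key makes ties deterministic.
  SELECT
    UserID,
    clicked_size,
    freq,
    ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY freq DESC, clicked_size) AS rn
  FROM clicks
)
SELECT UserID, clicked_size AS most_clicked_size, freq
FROM ranked
WHERE rn = 1
ORDER BY UserID
```

On this mock, user_1 clicked M three times across the two sessions and user_2 clicked L once, so those are the rows kept. With the real tables, the clicks step would presumably be the query from the question with COUNT(*) added to its SELECT list. Without the extra sort key, a tie between two sizes would be resolved arbitrarily, which is also true of ORDER BY ... LIMIT 1 in the queries above.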