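I want to calculate pageviews per page in BigQuery using the Google Analytics export tables. I only want to count pages whose custom content group is ProductList_UA or ProductDetails_UA, and I want to trim all parameters from the end of the page URL so that the query returns a more manageable list of pages.
My query so far is shown below, but the pageview, bounce, and exit counts are too high (by roughly a factor of 8). Where am I going wrong?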
SELECT
  IFNULL(REGEXP_EXTRACT(hits.page.pagePath, r'^(.*?)\?'), hits.page.pagePath) AS Trimmed_Page,
  COUNT(hits.page.pagePath) AS Pageviews,
  SUM(totals.bounces) AS Bounces,
  SUM(IF(hits.isExit = TRUE, 1, 0)) AS Exits,
  SUM(IF(hits.isEntrance = TRUE, 1, 0)) AS Entrances,
  MIN(hits.contentGroup.contentGroup3) AS Content_Group
FROM `xxx.ga_sessions_20*` AS m
CROSS JOIN UNNEST(m.customDimensions) AS customDimension
CROSS JOIN UNNEST(m.hits) AS hits
WHERE PARSE_DATE('%y%m%d', _TABLE_SUFFIX) BETWEEN
  DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AND
  DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  AND (hits.contentGroup.contentGroup3 = 'ProductList_UA' OR hits.contentGroup.contentGroup3 = 'ProductDetails_UA')
  AND hits.type = "PAGE"
  AND hits.isInteraction = TRUE
GROUP BY Trimmed_Page
ORDER BY Pageviews DESC
LIMIT 1000
Answer 0: (score: 1)
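I suspect the cross join with customDimensions is why you are seeing higher numbers than expected: every matching hit row is multiplied by the number of entries in that session's customDimensions array. Try the query without that cross join and see whether it fixes the problem.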
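
As a rough sketch of that experiment, the query below is the one from the question with the `CROSS JOIN UNNEST(m.customDimensions)` line removed, which should be safe because nothing in the SELECT or WHERE clause reads from `customDimension`. The table name `xxx.ga_sessions_20*` is the placeholder from the question. Using `COUNTIF` and `IN` instead of `SUM(IF(...))` and the `OR` chain is only a stylistic assumption on my part, not something the fix depends on.

```sql
-- Sketch: same query as in the question, minus the customDimensions cross join.
-- `xxx.ga_sessions_20*` is the placeholder table name from the question.
SELECT
  -- Keep everything before the first '?', or the whole path if there is no '?'.
  IFNULL(REGEXP_EXTRACT(hits.page.pagePath, r'^(.*?)\?'), hits.page.pagePath) AS Trimmed_Page,
  COUNT(hits.page.pagePath) AS Pageviews,
  SUM(totals.bounces) AS Bounces,
  COUNTIF(hits.isExit) AS Exits,
  COUNTIF(hits.isEntrance) AS Entrances,
  MIN(hits.contentGroup.contentGroup3) AS Content_Group
FROM `xxx.ga_sessions_20*` AS m
-- Only hits is unnested, so each hit produces exactly one row.
CROSS JOIN UNNEST(m.hits) AS hits
WHERE PARSE_DATE('%y%m%d', _TABLE_SUFFIX) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  AND hits.contentGroup.contentGroup3 IN ('ProductList_UA', 'ProductDetails_UA')
  AND hits.type = 'PAGE'
  AND hits.isInteraction = TRUE
GROUP BY Trimmed_Page
ORDER BY Pageviews DESC
LIMIT 1000
```

If the numbers drop by about the factor you observed, that would be consistent with sessions carrying around eight session-level custom dimensions each; selecting `ARRAY_LENGTH(customDimensions)` for a few sessions is one way to check that assumption.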