使用BigQuery进行网站搜索分析

时间:2020-03-12 10:10:43

标签: google-analytics google-bigquery

我是BigQuery导出架构的新手,正在尝试分析电子商务网站中的网站搜索模式。

一旦有人使用站点中的搜索栏,我想获得每个会话执行的步骤。我想检索使用的关键字和网址(也就是searchPath,previousSearchPath和nextSearchPath)。

对此,我想检查一下:

  • 当用户(n次点击/互动后)考虑在我们的网站中搜索
  • 搜索前他/她访问的最后一个网址
  • 搜索后他访问的结果页面
  • 以下操作

到目前为止,我得到了这个查询。但是,prev_page_path / step / nextStep与某些逻辑不匹配。我得到类似的东西:

  • product-list-page /
  • product-list-page / 2
  • product-list-page / 3
SELECT
  CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) AS sessionId,
  IF(cd.index=7,cd.value,NULL) AS customDimension7,
  IF(cd.index=179,cd.value,NULL) AS customDimension179,
  h.contentGroup.contentGroup2 AS CG,
  h.hitNumber AS hitNumber,
  h.page.searchKeyword AS keyword,
  LAG(h.page.pagePath, 1) OVER (PARTITION BY CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) ORDER BY h.hitNumber) AS prev_page_path,
  h.page.pagePath AS step,
  ROW_NUMBER() OVER (PARTITION BY CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) ORDER BY hitNumber ASC) AS rowNumber,
  LEAD(h.page.pagePath, 1) OVER (PARTITION BY CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) ORDER BY hitNumber ASC) AS nextStep
FROM
  `xxx-xx-xxx.xxxxxxxxxxxx.ga_sessions_*`,
  UNNEST(hits) AS h,
  UNNEST(h.customDimensions) AS cd
WHERE
  _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  AND cd.index = 12
  AND h.page.searchKeyword IS NOT NULL
  AND cd.value >= '1'
LIMIT
  100

1 个答案:

答案 0 :(得分:0)

我终于设法为网站搜索跟踪添加了一个事件。因此,我现在拥有触发该事件的会话,以及使用hit.page.searchKeywords的结果以及计算所需指标所需的所有其他信息。

WITH event AS (
  SELECT
      CONCAT(fullVisitorId, visitStartTime) as sessionId,
      CASE WHEN hit.eventInfo.eventAction= 'sitesearch' THEN hit.page.pagePath ELSE NULL END AS pagepath_search,
    FROM `project.dataset.ga_sessions_*` AS ga,  UNNEST(hits) AS hit
    WHERE _TABLE_SUFFIX = 'date'
                ),
keyword AS (
   SELECT 
    CONCAT(fullVisitorId, visitStartTime) as sessionId,
    hit.page.searchKeyword as keyword,
    hit.page.pagePath AS pagepath_result,
  FROM `project.dataset.ga_sessions_*` AS ga,  UNNEST(hits) AS hit
  WHERE _TABLE_SUFFIX = 'date'
            )

SELECT 
  keyword.keyword, 
  evento.pagepath_search, 
  COUNT(DISTINCT keywords.sessionId) AS total_sessions
FROM
  event INNER JOIN keyword ON event.sessionId = keyword.sessionId

我一直在学习这一点,所以,如果您知道可能更好的方法,那就打我吧!