Recreating a GA funnel in BigQuery

Time: 2017-07-26 16:42:26

Tags: google-analytics google-bigquery funnelweb

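I'm trying to recreate a GA funnel (a custom report on Google 360) using BigQuery. The funnel in GA uses the count of unique events that occur on each page. I found this query online, and it mostly works: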

SELECT
  COUNT(s0.firstHit) AS Landing_Page,
  COUNT(s1.firstHit) AS Model_Selection
FROM (
  SELECT
    s0.fullvisitorID,
    s0.firstHit,
    s1.firstHit
  FROM (
    # Begin Subquery #1 aka s0
    SELECT
      fullvisitorID,
      MIN(hits.hitNumber) AS firstHit
    FROM [64269470.ga_sessions_20170720]
    WHERE
      hits.eventInfo.eventAction IN ('landing_page')
      AND totals.visits = 1
    GROUP BY
      fullvisitorID
  ) s0
  # End Subquery #1 aka s0

  LEFT JOIN (
    # Begin Subquery #2 aka s1
    SELECT
      fullvisitorID,
      MIN(hits.hitNumber) AS firstHit
    FROM [64269470.ga_sessions_20170720]
    WHERE
      hits.eventInfo.eventAction IN ('model_selection_page')
      AND totals.visits = 1
    GROUP BY
      fullvisitorID
  ) s1
  ON
    s0.fullvisitorID = s1.fullvisitorID
)

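The query runs fine, and the Landing_Page value matches what I see in GA, but Model_Selection is about 10% higher. The discrepancy also grows further down the funnel (I've only posted 2 steps for clarity). Any idea what I'm missing here?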

1 answer:

Answer 0: (score: 1)

This query, written in Standard SQL, should do what you need:

#standardSQL
SELECT 
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM `64269470.ga_sessions_20170720`

That's it. Four lines, and much faster and cheaper, since it reads the table once and avoids the self-join.

You can also try it against simulated data, for example:

#standardSQL
WITH data AS(
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits
)

SELECT 
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM data

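Note that building this type of report in GA can be a bit tricky, because you need to select visitors who triggered the event 'landing_page' and then, afterwards, triggered the event 'model_selection_page'. Make sure the report is also built correctly on the GA side (one approach might be to first build a custom report restricted to customers for whom 'landing_page' was triggered, and then apply a second filter for 'model_selection_page').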
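
The queries above only check that 'landing_page' occurs somewhere in the same session, not that it occurs before 'model_selection_page'. If the order matters, one possible sketch is to compare `hitNumber` values within each session. The mock data below is hypothetical (visitor '2' and the `hitNumber` values are made up for illustration), and `first_landing` is just an illustrative alias:

```sql
#standardSQL
-- Hypothetical mock data: hitNumber is included so event order can be checked.
WITH data AS (
  -- Session where landing_page fires before model_selection_page
  SELECT '1' AS fullvisitorid, [
    STRUCT(1 AS hitNumber, STRUCT('landing_page' AS eventAction) AS eventInfo),
    STRUCT(2 AS hitNumber, STRUCT('model_selection_page' AS eventAction) AS eventInfo)
  ] AS hits
  UNION ALL
  -- Session where model_selection_page fires before landing_page
  SELECT '2' AS fullvisitorid, [
    STRUCT(1 AS hitNumber, STRUCT('model_selection_page' AS eventAction) AS eventInfo),
    STRUCT(2 AS hitNumber, STRUCT('landing_page' AS eventAction) AS eventInfo)
  ] AS hits
)
SELECT
  -- Sessions with at least one landing_page event
  COUNTIF(first_landing IS NOT NULL) AS Landing_Page,
  -- Sessions with a model_selection_page event after the first landing_page event
  COUNTIF(EXISTS(
    SELECT 1
    FROM UNNEST(hits)
    WHERE eventInfo.eventAction = 'model_selection_page'
      AND hitNumber > first_landing
  )) AS Model_Selection
FROM (
  SELECT
    hits,
    -- hitNumber of the first landing_page event in the session (NULL if none)
    (SELECT MIN(hitNumber)
     FROM UNNEST(hits)
     WHERE eventInfo.eventAction = 'landing_page') AS first_landing
  FROM data
)
```

With this mock data, both sessions count toward Landing_Page, but only the first one counts toward Model_Selection, because in the second session 'model_selection_page' fires before 'landing_page'.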

[EDIT]:

You asked in the comments how to count at the session and user level. To count per session, you can limit the result of each subquery to 1, like so:

SELECT 
  SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
  SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data

To count distinct users, the idea is the same, but you have to apply a COUNT(DISTINCT) operation, like so:

SELECT 
  COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
  COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data
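
To see how the three counting levels differ, here is a self-contained sketch that computes hit, session and user counts side by side. It assumes mock data equivalent to the four sessions shown earlier (all from visitor '1'); the output column names and the `landing_hits` / `model_hits` aliases are illustrative only:

```sql
#standardSQL
-- Mock data: four sessions, all belonging to visitor '1'.
WITH data AS (
  SELECT '1' AS fullvisitorid, [
    STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)
  ] AS hits
  UNION ALL
  SELECT '1' AS fullvisitorid, [
    STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo),
    STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)
  ] AS hits
  UNION ALL
  SELECT '1' AS fullvisitorid, [
    STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo),
    STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)
  ] AS hits
  UNION ALL
  SELECT '1' AS fullvisitorid, [
    STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo),
    STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)
  ] AS hits
)
SELECT
  -- Hit level: every matching event is counted
  SUM(landing_hits) AS Landing_Page_Hits,
  SUM(IF(landing_hits > 0, model_hits, 0)) AS Model_Selection_Hits,
  -- Session level: each session is counted at most once
  COUNTIF(landing_hits > 0) AS Landing_Page_Sessions,
  COUNTIF(landing_hits > 0 AND model_hits > 0) AS Model_Selection_Sessions,
  -- User level: each visitor is counted at most once
  COUNT(DISTINCT IF(landing_hits > 0, fullvisitorid, NULL)) AS Landing_Page_Users,
  COUNT(DISTINCT IF(landing_hits > 0 AND model_hits > 0, fullvisitorid, NULL)) AS Model_Selection_Users
FROM (
  SELECT
    fullvisitorid,
    -- Number of events of each type within the session
    (SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits)) AS landing_hits,
    (SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits)) AS model_hits
  FROM data
)
```

On this mock data the landing page step comes out as 4 hits, 3 sessions and 1 user, while the model selection step is 1 at every level. Which level matches the GA report depends on how that report is defined.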