Select by a specific value of a customDimension (BQ <> GA)

Time: 2017-08-24 20:22:29

Tags: google-analytics google-bigquery

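I am new to BigQuery (a few weeks of experience) and am trying to improve my skills. I have a practical question about the following very interesting query (Recreate GA Funnel on BigQuery) by user Willian Fuks. It reproduces a funnel from GA data in BigQuery in an efficient way.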

#standardSQL
SELECT 
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM `64269470.ga_sessions_20170720`

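The example uses eventInfo.eventAction. I tried several things to make it work with a customDimension, but failed. Does anyone know how I can reproduce this query so that it segments on a customDimension instead of eventInfo.eventAction?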

I used this:

(SELECT MAX(IF(index=1,page1, NULL))FROM UNNEST(hits.customDimensions)) 

1 Answer:

Answer 0: (score: 0)

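Working with customDimensions is a bit more challenging because this field is also of type ARRAY (repeated). In practice, the main difference is that one more UNNEST operation is needed. Other than that, the logic is the same.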

Here is an example with some simulated data:

#standardSQL
WITH data AS(
  SELECT '1' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL

  SELECT '1' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL

  SELECT '2' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL

  SELECT '3' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL

  SELECT '3' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
  SELECT '4' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> >  >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
                                                                                                                                           STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
)

Each user (fullvisitorid) and each session (visitid) has a hits ARRAY. Note that I've labeled each hit with a hitNumber, which I find makes the data easier to follow.
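To see what this nested structure looks like once flattened, one option is to unnest both levels and print one row per custom dimension. The following is a minimal sketch, assuming a hypothetical one-session CTE named mini_data that copies the shape of the mock data above; the aliases h and cd are illustrative and not part of the original queries.

```sql
#standardSQL
-- mini_data is a hypothetical one-session sample shaped like the mock data.
WITH mini_data AS (
  SELECT
    '3' AS fullvisitorid,
    1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'landing_page' AS value),
              STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber,
             [STRUCT(3 AS index, 'model_selection_page' AS value),
              STRUCT(2 AS index, 'value2' AS value)] AS customDimension)
    ] AS hits
)
SELECT
  fullvisitorid,
  visitid,
  h.hitNumber,
  cd.index,
  cd.value
FROM mini_data,
  UNNEST(hits) AS h,                 -- first level: one row per hit
  UNNEST(h.customDimension) AS cd    -- second level: one row per custom dimension of that hit
ORDER BY h.hitNumber, cd.index
```

With this sample the result should contain four rows, two for each hit, which makes it easy to check by eye which index and value pairs a session contains.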

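The following query counts the total number of sessions in which a customDimension with index 1 and value 'landing_page' occurred, and how many of those sessions also contain a customDimension with index 3 and value 'model_selection_page':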

#standardSQL
SELECT 
  SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
  SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM
  data

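You can play with the simulated data to better understand what is happening here. In short, note the two UNNESTs: the first one exposes the values inside hits, the second one the values inside customDimension.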
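Applied to the table from the question, the same pattern might look like the sketch below. It assumes the standard GA export schema, where the repeated field on each hit is called customDimensions (plural, unlike the customDimension field of the mock data). The indexes 1 and 3 and the two values are carried over from the mock data and are assumptions about how the property is actually configured; the aliases h, cd, h2 and cd2 are illustrative.

```sql
#standardSQL
SELECT
  -- Sessions with at least one hit carrying index 1 = 'landing_page'.
  SUM((SELECT 1
       FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS cd
       WHERE cd.index = 1 AND cd.value = 'landing_page'
       LIMIT 1)) AS Landing_Page,
  -- Sessions that also have index 3 = 'model_selection_page' somewhere.
  SUM((SELECT 1
       FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS cd
       WHERE cd.index = 3 AND cd.value = 'model_selection_page'
         AND EXISTS(SELECT 1
                    FROM UNNEST(hits) AS h2, UNNEST(h2.customDimensions) AS cd2
                    WHERE cd2.index = 1 AND cd2.value = 'landing_page')
       LIMIT 1)) AS Model_Selection
FROM `64269470.ga_sessions_20170720`
```

Replace the indexes and values with the ones your own custom dimensions use.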

The Model_Selection field is slightly more complex because it first has to evaluate whether the 'landing_page' dimension was fired, as you can see in this expression:

EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page')

If the 'landing_page' dimension appears anywhere in the session's hits, this expression evaluates to TRUE in the WHERE clause.
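One way to watch this condition in isolation is to return it as a column for each session. The sketch below assumes a hypothetical two-session CTE named mini_data in the shape of the mock data; the column name saw_landing_page and the aliases h and cd are illustrative.

```sql
#standardSQL
-- mini_data is a hypothetical two-session sample shaped like the mock data.
WITH mini_data AS (
  SELECT '2' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
  UNION ALL
  SELECT '3' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber,
             [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
)
SELECT
  fullvisitorid,
  visitid,
  -- TRUE when any hit of the session carries index 1 = 'landing_page'.
  EXISTS(SELECT 1
         FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
         WHERE cd.index = 1 AND cd.value = 'landing_page') AS saw_landing_page
FROM mini_data
ORDER BY fullvisitorid, visitid
```

With this sample, the session of user '2' should come back as false and the session of user '3' as true. Note that the check does not look at hitNumber, so it holds wherever in the session the landing page occurs.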

You can also compute the results at the user level, like so:

#standardSQL
SELECT 
  COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
  COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM
  data
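The same user-level counts can also be written by first computing one boolean flag per session and then counting distinct users, which some readers may find easier to follow. This is an alternative sketch, not the query above; it assumes a hypothetical CTE named mini_data with three sessions in the shape of the mock data, and the names saw_landing, saw_model_selection, h and cd are illustrative.

```sql
#standardSQL
-- mini_data is a hypothetical three-session sample shaped like the mock data.
WITH mini_data AS (
  SELECT '1' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension)
    ] AS hits
  UNION ALL
  SELECT '3' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber,
             [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
  UNION ALL
  SELECT '3' AS fullvisitorid, 2 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > > >[
      STRUCT(1 AS hitNumber,
             [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber,
             [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
),
session_flags AS (
  SELECT
    fullvisitorid,
    EXISTS(SELECT 1
           FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
           WHERE cd.index = 1 AND cd.value = 'landing_page') AS saw_landing,
    EXISTS(SELECT 1
           FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
           WHERE cd.index = 3 AND cd.value = 'model_selection_page') AS saw_model_selection
  FROM mini_data
)
SELECT
  -- IF(..., fullvisitorid, NULL) yields NULL for non-matching sessions,
  -- and COUNT(DISTINCT ...) ignores NULLs.
  COUNT(DISTINCT IF(saw_landing, fullvisitorid, NULL)) AS Landing_Page,
  COUNT(DISTINCT IF(saw_landing AND saw_model_selection, fullvisitorid, NULL)) AS Model_Selection
FROM session_flags
```

With this sample, Landing_Page should be 2 (users '1' and '3') and Model_Selection should be 1 (user '3' only, even though that user has two qualifying sessions).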

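While you are learning BigQuery, I recommend playing with mock data and checking the output of each step as you go. You can experiment with UNNEST, run a few queries to inspect what it returns, and so on, to build a better and deeper understanding of how to use these techniques.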
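As a starting point for that kind of experimenting, a very small query such as the sketch below shows what UNNEST does to a plain array. The array values and the aliases element and position are made up for illustration.

```sql
#standardSQL
-- UNNEST turns an array into rows; WITH OFFSET adds each element's zero-based position.
SELECT element, position
FROM UNNEST(['landing_page', 'model_selection_page', 'checkout']) AS element
  WITH OFFSET AS position
ORDER BY position
```

This should return three rows, one per array element, with positions 0, 1 and 2. From there you can move on to arrays of STRUCTs like hits and customDimension.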