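I'm new to BigQuery (a few weeks of experience) and am trying to improve my skills. I have a practical question about the following very interesting query (Recreate GA Funnel on BigQuery) by user Willian Fuks. It reproduces a funnel efficiently from GA data in BigQuery.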
#standardSQL
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection
FROM `64269470.ga_sessions_20170720`
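The example uses eventInfo.eventAction. I tried several things to make it work with a customDimension instead, but I failed. Does anyone know how I can reproduce this query so that it segments on a customDimension rather than on eventInfo.eventAction?
This is what I used: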
(SELECT MAX(IF(index=1,page1, NULL))FROM UNNEST(hits.customDimensions))
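Answer 0 (score: 0)
Working with customDimensions is a bit more challenging because this field is also of type ARRAY (repeated). In practice, the main difference is that one more UNNEST operation is needed. Other than that, the logic is the same.
Here is an example with some mock data: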
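As a minimal sketch of that extra UNNEST, the query below flattens a tiny inline sample into one row per hit and per custom dimension. The CTE name demo and its values are made up for illustration, and the nested field is called customDimension to match the mock data used in this answer. An expression such as UNNEST(hits.customDimensions) does not work because hits is itself an ARRAY, so its elements have to be unnested before their fields can be read.

```sql
#standardSQL
-- Hypothetical one-session sample: hits is an ARRAY of STRUCTs, and each hit
-- carries its own customDimension ARRAY.
WITH demo AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'other_page' AS value)] AS customDimension)
  ] AS hits
)
SELECT
  h.hitNumber,
  cd.index,
  cd.value
FROM demo,
  UNNEST(hits) AS h,               -- first UNNEST: one row per hit
  UNNEST(h.customDimension) AS cd  -- second UNNEST: one row per dimension of that hit
ORDER BY h.hitNumber, cd.index
```

Once the rows are flattened like this, filtering on index and value works the same way as filtering on eventInfo.eventAction.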
#standardSQL
WITH data AS(
SELECT '1' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '2' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '3' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '3' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '4' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits
)
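Each user (fullvisitorid) and each session (visitid) has a hits ARRAY. Note that I have labeled each hit with a hitNumber, which I find makes the data easier to follow.
The following query counts the total number of sessions in which a customDimension with index 1 and value 'landing_page' occurred and, among those, the sessions in which a customDimension with index 3 and value 'model_selection_page' also occurred: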
#standardSQL
SELECT
SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection
FROM
data
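You can play with the mock data to get a better understanding of what is happening here. In short, notice the two UNNEST operations: the first retrieves the values inside hits and the second retrieves the values inside customDimension.
The Model_Selection field is a bit more complex because it first has to evaluate whether the 'landing_page' dimension was fired, as you can see in this expression: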
EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page')
If the 'landing_page' dimension appears anywhere in hits, this expression evaluates to TRUE in the WHERE clause.
You can also get the results at the user level, like this:
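To see what the EXISTS check does for each session, here is a small sketch that returns one row per session with a flag for each step. The three sessions are hypothetical and much smaller than the mock data above, and the column names saw_landing_page and saw_model_selection_page are only illustrative.

```sql
#standardSQL
-- Hypothetical compact data: three sessions, each with a hits ARRAY.
WITH data AS (
  SELECT 1 AS visitid, [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
    STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
  ] AS hits
  UNION ALL
  SELECT 2, [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension)
  ]
  UNION ALL
  SELECT 3, [
    STRUCT(1 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
  ]
)
SELECT
  visitid,
  -- TRUE when any hit of the session carries index 1 = 'landing_page'
  EXISTS(
    SELECT 1 FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
    WHERE cd.index = 1 AND cd.value = 'landing_page'
  ) AS saw_landing_page,
  -- TRUE when any hit of the session carries index 3 = 'model_selection_page'
  EXISTS(
    SELECT 1 FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
    WHERE cd.index = 3 AND cd.value = 'model_selection_page'
  ) AS saw_model_selection_page
FROM data
ORDER BY visitid
```

Session 1 has both flags set, session 2 only the first and session 3 only the second. In the funnel query, only a session like session 1 contributes to Model_Selection, because the EXISTS condition on 'landing_page' has to hold as well.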
#standardSQL
SELECT
COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection
FROM
data
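The difference between the session-level and user-level counts can be checked side by side. The sketch below is one possible way to write it, not the only one: it computes a per-session flag first and then aggregates it twice. The visitors 'a' and 'b' and the column name saw_landing are hypothetical.

```sql
#standardSQL
-- Hypothetical data: visitor 'a' has two sessions, visitor 'b' has one,
-- and every session contains the landing page dimension.
WITH data AS (
  SELECT 'a' AS fullvisitorid, 1 AS visitid,
    [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension)] AS hits
  UNION ALL
  SELECT 'a', 2,
    [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension)]
  UNION ALL
  SELECT 'b', 1,
    [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension)]
)
SELECT
  COUNTIF(saw_landing) AS landing_page_sessions,                            -- counts sessions
  COUNT(DISTINCT IF(saw_landing, fullvisitorid, NULL)) AS landing_page_users -- counts visitors
FROM (
  SELECT
    fullvisitorid,
    EXISTS(
      SELECT 1 FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
      WHERE cd.index = 1 AND cd.value = 'landing_page'
    ) AS saw_landing
  FROM data
)
```

With this sample the session count is 3 and the user count is 2, since visitor 'a' reached the landing page in two separate sessions but is only counted once by COUNT(DISTINCT ...).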
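While you are learning BigQuery, I recommend working with mock data and testing each step while observing its output. You can play with UNNEST, run a few queries to inspect what it returns, and so on, to build a better and deeper understanding of how to use these techniques.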
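A very small experiment of that kind, as a sketch: unnest a literal array and look at the rows it produces. The alias names element and pos are arbitrary.

```sql
#standardSQL
-- UNNEST turns an ARRAY into rows; WITH OFFSET adds the zero-based position.
SELECT
  element,
  pos
FROM UNNEST(['landing_page', 'model_selection_page']) AS element WITH OFFSET AS pos
ORDER BY pos
```

This returns two rows, one per array element, with pos equal to 0 and 1. From here you can replace the literal array with an ARRAY of STRUCTs and add the second UNNEST step by step.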