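This week I came across the BigQuery standard SQL documentation, which got me started building closed funnels on Firebase Analytics data. However, I am getting wrong results (see the image below). There should be no users with "Tutorial_LessonCompleted" who did not first trigger "Tutorial_LessonStarted >> Lesson = 1". This could be due to a variety of reasons.
Questions:
How can a filter on _TABLE_SUFFIX = '20170701' be useful? I read that it would make the query cheaper. Any suggestions for optimized code will be received with open arms and upvotes!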
#standardSQL
SELECT
step1, step2, step3, step4, step5, step6,
COUNT(*) AS funnel_count,
COUNT(DISTINCT user_id) AS users
FROM (
SELECT
user_dim.app_info.app_instance_id AS user_id,
event.timestamp_micros AS event_timestamp,
event.name AS step1,
LEAD(event.name, 1) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step2,
LEAD(event.name, 2) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step3,
LEAD(event.name, 3) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step4,
LEAD(event.name, 4) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step5,
LEAD(event.name, 5) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step6
FROM
`......`,
UNNEST(event_dim) AS event,
UNNEST(user_dim.user_properties) AS user_prop
WHERE user_prop.key = "first_open_time"
ORDER BY 1, 2, 3, 4, 5 ASC
)
WHERE step6 = "Tutorial_LessonStarted" AND EXISTS (
SELECT *
FROM `......`,
UNNEST(event_dim) AS event,
UNNEST(event.params)
WHERE key = 'LessonNumber' AND value.string_value = "lesson1")
GROUP BY step1, step2, step3, step4, step5, step6
ORDER BY funnel_count DESC
LIMIT 100;
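Note:
project_id.com_game_example_IOS.app_events_20170212
Output:
Update since the original question above:
@Elliot: I don't understand why you say this: "ensure that an event with lesson1 precedes Tutorial_LessonStarted".
Tutorial_LessonStarted has a parameter "LessonNumber" whose values are lesson1, lesson2, lesson3, lesson4.
I want to count all funnels whose last step has LessonNumber = lesson1.
So, applied to the event log data of the first session of brand-new users (that is, users who triggered first_open_time), the answer would look like the table below:
So it is important to first get all users who have a first_open_time on a specific date, then structure the events into funnels so that the last event in the funnel matches a given event and a specific parameter value, and build the funnel "backwards" from there.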
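Answer 0 (score: 1)
Let me explain a few things first, and then see if I can suggest a query to get you started.
It looks like you want to analyze the sequence of events in your analytics data, but the sequence is already there: you have an array of events. Looking at the Firebase schema for BigQuery, event_dim is the relevant column, and unless I am misunderstanding something, these events are ordered by time. If you want to look at the name of the sixth event, you can use: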
event_dim[SAFE_ORDINAL(6)].name
This evaluates to NULL if there are fewer than six events; otherwise it gives you a string containing the event name.
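To make the NULL behavior concrete, here is a minimal sketch that runs without any table. The sample CTE and its two event names are made up for illustration; only the event_dim column name and the .name field come from the Firebase schema.

```sql
#standardSQL
-- Hypothetical inline data: one row whose event_dim holds only two events.
WITH sample AS (
  SELECT [STRUCT('first_open' AS name),
          STRUCT('session_start' AS name)] AS event_dim
)
SELECT
  -- Position 1 exists, so this returns the event name.
  event_dim[SAFE_ORDINAL(1)].name AS first_event,
  -- Position 6 is out of range, so SAFE_ORDINAL yields NULL instead of an error.
  event_dim[SAFE_ORDINAL(6)].name AS sixth_event
FROM sample;
```

With plain ORDINAL(6) the same query would fail with an out-of-bounds error, which is why the SAFE_ variant is the better fit inside a filter.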
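Another observation is that you are trying to analyze both event_dim and user_dim, but you are taking the cross product of the two, which blows up the number of rows and makes it very hard to reason about the results of the query. To look for a specific user property, use an expression of this form instead: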
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>'
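As a rough illustration of the difference, the sketch below compares the two approaches on a single made-up row. The sample CTE, the property values, and the event names other than the two Tutorial_ events are assumptions for the example; the nesting of user_properties mirrors the value.value.string_value path used above.

```sql
#standardSQL
-- Hypothetical one-row sample shaped like the Firebase export:
-- three events and two user properties (all values are made up).
WITH sample AS (
  SELECT
    STRUCT([
      STRUCT('first_open_time' AS key,
             STRUCT(STRUCT('1498867200000' AS string_value) AS value) AS value),
      STRUCT('language' AS key,
             STRUCT(STRUCT('en' AS string_value) AS value) AS value)
    ] AS user_properties) AS user_dim,
    [STRUCT('session_start' AS name),
     STRUCT('Tutorial_LessonStarted' AS name),
     STRUCT('Tutorial_LessonCompleted' AS name)] AS event_dim
)
SELECT
  -- Comma joins pair every event with every property: 3 * 2 = 6 rows.
  (SELECT COUNT(*)
   FROM sample,
        UNNEST(event_dim) AS event,
        UNNEST(user_dim.user_properties) AS user_prop
  ) AS cross_product_rows,
  -- The scalar subquery reads one property and keeps the single input row.
  (SELECT COUNT(*)
   FROM sample
   WHERE (SELECT value.value.string_value
          FROM UNNEST(user_dim.user_properties)
          WHERE key = 'first_open_time') = '1498867200000'
  ) AS filtered_rows;
```

On this sample, cross_product_rows is 6 and filtered_rows is 1. Note that the scalar subquery raises an error if a row contains the same key more than once, so it assumes property keys are unique within a row.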
Combining both of these filters, your FROM and WHERE clauses would look something like this:
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701' AND
event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>'
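On the _TABLE_SUFFIX question: with a wildcard table, a constant filter on _TABLE_SUFFIX restricts which daily tables are scanned, and fewer scanned bytes means a cheaper query. A sketch under the assumption that the daily tables share the prefix shown in the question; the one-week date range is made up for illustration:

```sql
#standardSQL
-- Assumes daily tables named project_id.com_game_example_IOS.app_events_YYYYMMDD.
-- Only the tables whose suffix falls in the range are read.
SELECT
  _TABLE_SUFFIX AS table_day,
  COUNT(*) AS row_count
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170707'
GROUP BY table_day
ORDER BY table_day;
```

Without the WHERE clause, the same query would read every table matching the wildcard.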
Using the bracket operator to access the steps in event_dim, we can do something like this:
WITH FilteredInput AS (
SELECT *
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701' AND
event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>' AND
-- ensure that an event with lesson1 precedes Tutorial_LessonStarted
EXISTS (
SELECT 1
FROM UNNEST(event_dim) WITH OFFSET event_offset
CROSS JOIN UNNEST(params)
WHERE key = 'LessonNumber' AND
value.string_value = 'lesson1' AND
event_offset < 5
)
)
SELECT
event_dim[ORDINAL(1)].name AS step1,
event_dim[ORDINAL(2)].name AS step2,
event_dim[ORDINAL(3)].name AS step3,
event_dim[ORDINAL(4)].name AS step4,
event_dim[ORDINAL(5)].name AS step5,
event_dim[ORDINAL(6)].name AS step6,
COUNT(*) AS funnel_count,
COUNT(DISTINCT user_dim.user_id) AS users
FROM FilteredInput
GROUP BY step1, step2, step3, step4, step5, step6;
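This returns all unique "paths" along with a count and the number of users for each. Note that I wrote this off the top of my head, and I don't have representative data to try it on, so there may be syntax or other errors.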
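One detail of the EXISTS clause that is easy to misread: WITH OFFSET is zero-based, while ORDINAL is one-based, so event_offset < 5 covers positions 1 through 5, that is, everything strictly before the sixth event. A small sketch with an inline array; the event names other than the two Tutorial_ events are hypothetical:

```sql
#standardSQL
-- Six hypothetical event names in time order; the sixth is the funnel's last step.
SELECT
  name,
  event_offset,                          -- zero-based position from WITH OFFSET
  event_offset + 1 AS ordinal_position   -- matching one-based ORDINAL index
FROM UNNEST(['first_open', 'session_start', 'screen_view',
             'Tutorial_LessonCompleted', 'level_up',
             'Tutorial_LessonStarted']) AS name WITH OFFSET event_offset
WHERE event_offset < 5                   -- keeps ordinal positions 1 through 5
ORDER BY event_offset;
```

This returns the first five names and leaves out the sixth. If the goal is instead to test the parameters of the sixth event itself, the condition would be event_offset = 5; I have not run either variant against real export data.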