Accessing structs and arrays for a Firebase closed funnel via BigQuery

Date: 2017-08-10 17:04:09

Tags: google-analytics google-bigquery firebase-analytics

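This week I came across this standard SQL BigQuery documentation, which got me started on a closed funnel for Firebase Analytics. However, I am getting wrong results (see the image below). There should be no users with "Tutorial_LessonCompleted" who did not first start "Tutorial_LessonStarted >> Lesson = 1". There could be several reasons for this.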

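Questions:

  1. Is it wise to use User Property = "first_open_time", or would it be better to use Event = "first_open"? What would the latter implementation look like?
  2. I suspect I may not be drilling down correctly into: Event (String = "Tutorial_LessonStarted") >> parameter (String = "LessonNumber") >> value (String = "lesson1")?
  3. How would a filter on _TABLE_SUFFIX = '20170701' help? I read that it would be cheaper. Any suggestions for optimized code are welcomed with open arms and upvotes!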

    #standardSQL
    SELECT
      step1, step2, step3, step4, step5, step6,
      COUNT(*) AS funnel_count,
      COUNT(DISTINCT user_id) AS users 
    FROM (
    SELECT
        user_dim.app_info.app_instance_id AS user_id,
        event.timestamp_micros AS event_timestamp,
        event.name AS step1,
        LEAD(event.name, 1) OVER (
          PARTITION BY user_dim.app_info.app_instance_id 
          ORDER BY event.timestamp_micros ASC) as step2,
        LEAD(event.name, 2) OVER (
          PARTITION BY user_dim.app_info.app_instance_id 
          ORDER BY event.timestamp_micros ASC) as step3,
        LEAD(event.name, 3) OVER (
          PARTITION BY user_dim.app_info.app_instance_id 
          ORDER BY event.timestamp_micros ASC) as step4,
        LEAD(event.name, 4) OVER (
          PARTITION BY user_dim.app_info.app_instance_id 
          ORDER BY event.timestamp_micros ASC) as step5,
        LEAD(event.name, 5) OVER (
          PARTITION BY user_dim.app_info.app_instance_id 
          ORDER BY event.timestamp_micros ASC) as step6
    FROM
        `......`, 
        UNNEST(event_dim) AS event, 
        UNNEST(user_dim.user_properties) AS user_prop  
    WHERE user_prop.key = "first_open_time"
    ORDER BY 1, 2, 3, 4, 5 ASC
    )
    WHERE step6 = "Tutorial_LessonStarted" AND EXISTS (
    SELECT * 
    FROM `......`, 
    UNNEST(event_dim) AS event,
    UNNEST(event.params)
    WHERE key = 'LessonNumber' AND value.string_value = "lesson1") GROUP BY step1, step2, step3, step4, step5, step6
    ORDER BY funnel_count DESC
    LIMIT 100;
    
  4. Notes:

    1. Enter your own table in FROM, e.g.: project_id.com_game_example_IOS.app_events_20170212
    2. I left out funnel_count and user_count.
    3. Output

      ----------------------------------------------- -----------

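      Update since the original question above:

      @Elliot: I don't understand why you wrote "ensure that an event with lesson1 precedes Tutorial_LessonStarted".

      Tutorial_LessonStarted has a parameter "LessonNumber" whose value is one of lesson1, lesson2, lesson3, lesson4.

      I want to count all funnels in which the last step is Tutorial_LessonStarted with LessonNumber = lesson1.

      So, applied to the event log data of a brand-new user's first session (that is, a user who triggered first_open_time), the answer would look like the list below: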

      • View.OnboardingWelcomePage
      • View.OnboardingFinalPage
      • View.JamLoading
      • View.JamLoading
      • Jam.UserViewsJam
      • Jam.ProjectOpened
      • View.JamMixer
      • Tutorial.LessonStarted (the "LessonNumber" parameter of this event has the value "lesson1")
      • Jam.ProjectPlayStarted
      • View.JamLoopSelector
      • View.JamMixer
      • View.JamLoopSelector
      • View.JamMixer
      • View.JamLoopSelector
      • View.JamMixer
      • Tutorial.LessonCompleted
      • Tutorial.LessonStarted (the "LessonNumber" parameter of this event has the value "lesson2")

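      So it is important to first get all users whose first_open_time falls on a specific date, and then to structure their events into a funnel whose last event matches both an event name and a specific parameter value, building the funnel "backwards" from there.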

      enter image description here

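1 answer:

Answer 0 (score: 1)

Let me explain a few things first, and then see whether I can suggest a query to get you started.

It looks like you want to analyze the sequence of events in your analytics data, but the sequence is already there: you have an array of events. Looking at the Firebase schema for BigQuery, event_dim is the relevant column, and unless I am misunderstanding something, these events are ordered by time. If you want to inspect the name of the sixth event, you can use: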

event_dim[SAFE_ORDINAL(6)].name

If there are fewer than six events, this evaluates to NULL; otherwise it gives you a string containing the event name.
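To make the NULL behavior concrete, here is a small self-contained sketch. The `Sample` row and its two events are made up for illustration, and only the `name` field of the real schema is modeled.

```sql
#standardSQL
-- Hypothetical two-event row; real Firebase rows carry more fields per event.
WITH Sample AS (
  SELECT [STRUCT('first_open' AS name),
          STRUCT('View.JamLoading' AS name)] AS event_dim
)
SELECT
  event_dim[SAFE_ORDINAL(2)].name AS second_event,  -- 'View.JamLoading'
  event_dim[SAFE_ORDINAL(6)].name AS sixth_event    -- NULL: the array has only two elements
FROM Sample;
```

With plain ORDINAL(6) the same query would raise an out-of-bounds error instead of returning NULL, so SAFE_ORDINAL is the form to use in a filter.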

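Another observation is that you are trying to analyze both event_dim and user_dim, but you are taking the cross product of the two, which blows up the number of rows and makes the query results very hard to reason about. To look up a specific user property, use an expression of this form instead: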

(SELECT value.value.string_value
 FROM UNNEST(user_dim.user_properties)
 WHERE key = 'first_open_time') = '<expected property value>'
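As a rough illustration of the row blow-up, the sketch below builds one made-up row with three events and two user properties. The `Sample` data, the 'plan' property, and the property values are all hypothetical. I am also assuming the property is stored in string_value, as in the expression above.

```sql
#standardSQL
-- One hypothetical row: three events, two user properties.
WITH Sample AS (
  SELECT
    STRUCT([
      STRUCT('first_open_time' AS key,
             STRUCT(STRUCT('1498867200000' AS string_value) AS value) AS value),
      STRUCT('plan' AS key,
             STRUCT(STRUCT('free' AS string_value) AS value) AS value)
    ] AS user_properties) AS user_dim,
    [STRUCT('a' AS name), STRUCT('b' AS name), STRUCT('c' AS name)] AS event_dim
)
SELECT
  -- Unnesting both arrays side by side yields events x properties rows.
  (SELECT COUNT(*)
   FROM UNNEST(event_dim) AS event
   CROSS JOIN UNNEST(user_dim.user_properties) AS user_prop) AS cross_product_rows,  -- 3 * 2 = 6
  -- The scalar subquery picks out one property without multiplying rows.
  (SELECT user_prop.value.value.string_value
   FROM UNNEST(user_dim.user_properties) AS user_prop
   WHERE user_prop.key = 'first_open_time') AS first_open_time
FROM Sample;
```

The scalar subquery keeps one output row per input row, so counts computed afterwards are not inflated by the number of user properties.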

Combining these two filters, your FROM and WHERE clauses would look like this:

FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701' AND
  event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'first_open_time') = '<expected property value>'
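On the cost question: the wildcard table matches every daily table with that prefix, and a constant filter on _TABLE_SUFFIX restricts which of those tables are actually read. The same idea extends to a range of days. The week chosen below is only an assumed example, and the table name is the placeholder from the question.

```sql
#standardSQL
-- Reads only the seven daily tables for 2017-07-01 through 2017-07-07.
SELECT COUNT(*) AS matching_rows
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170707'
  AND event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted';
```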

Using the bracket operator to access the steps in event_dim, we can do something like this:

WITH FilteredInput AS (
  SELECT *
  FROM `project_id.com_game_example_IOS.app_events_*`
  WHERE _TABLE_SUFFIX = '20170701' AND
    event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
    (SELECT value.value.string_value
     FROM UNNEST(user_dim.user_properties)
     WHERE key = 'first_open_time') = '<expected property value>' AND
    -- ensure that an event with lesson1 precedes Tutorial_LessonStarted
    EXISTS (
      SELECT 1
      FROM UNNEST(event_dim) WITH OFFSET event_offset
      CROSS JOIN UNNEST(params)
      WHERE key = 'LessonNumber' AND
        value.string_value = 'lesson1' AND
        event_offset < 5
    )
)
SELECT
  event_dim[ORDINAL(1)].name AS step1,
  event_dim[ORDINAL(2)].name AS step2,
  event_dim[ORDINAL(3)].name AS step3,
  event_dim[ORDINAL(4)].name AS step4,
  event_dim[ORDINAL(5)].name AS step5,
  event_dim[ORDINAL(6)].name AS step6,
  COUNT(*) AS funnel_count,
  COUNT(DISTINCT user_dim.user_id) AS users
FROM FilteredInput
GROUP BY step1, step2, step3, step4, step5, step6;

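This will return all unique "paths" along with the funnel count and the number of users for each. Note that I wrote this off the top of my head, without representative data to try it on, so there may be syntax or other errors.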
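If the intent is instead that the sixth event itself must be Tutorial_LessonStarted with LessonNumber = lesson1 (as the update to the question suggests), one possible variation is to unnest only that event's params. This is an untested sketch under the same assumptions about the schema. Counting users by app_instance_id follows the question's query and is a choice, not a requirement.

```sql
#standardSQL
SELECT
  event_dim[ORDINAL(1)].name AS step1,
  event_dim[ORDINAL(2)].name AS step2,
  event_dim[ORDINAL(3)].name AS step3,
  event_dim[ORDINAL(4)].name AS step4,
  event_dim[ORDINAL(5)].name AS step5,
  event_dim[ORDINAL(6)].name AS step6,
  COUNT(*) AS funnel_count,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) AS users
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701'
  AND event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted'
  -- Check the parameters of the sixth event only, not of any earlier event.
  AND EXISTS (
    SELECT 1
    FROM UNNEST(event_dim[SAFE_ORDINAL(6)].params) AS param
    WHERE param.key = 'LessonNumber'
      AND param.value.string_value = 'lesson1')
GROUP BY step1, step2, step3, step4, step5, step6
ORDER BY funnel_count DESC;
```

Rows with fewer than six events are dropped by the SAFE_ORDINAL filter, so the ORDINAL lookups in the SELECT list should not go out of bounds.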