Question

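Firebase offers split testing through Firebase Remote Config, but it lacks the ability to filter retention in the cohorts section by user properties (or by any property, actually).

To work around this, I am looking at BigQuery, since Firebase Analytics provides a way to export data to that service.

But I am stuck on many questions, and Google has no answer or example that might point me in the right direction.

General questions:

As a first step, I need to aggregate data that represents the same data as the Firebase cohorts, so I can be sure my calculations are right.

The next step should be just to apply constraints to the queries so that they match custom user properties.

Here is what I have got so far:

The main problem: a big difference in the user counts. Sometimes the difference is about 100 users, but sometimes it is close to 1000.

This is the approach I use: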

# 1

# Count users with `user_dim.first_open_timestamp_micros` 
# in specified period (w0 – week 1)
# this is the way Firebase groups users into cohorts
# (users who started the app on the same day or during the same week)
# https://support.google.com/firebase/answer/6317510

SELECT
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as count
FROM
  (
   TABLE_DATE_RANGE
    (
     [admob-app-id-xx:xx_IOS.app_events_], 
     TIMESTAMP('2016-11-20'), 
     TIMESTAMP('2016-11-26')
    )
  )
WHERE
  STRFTIME_UTC_USEC(user_dim.first_open_timestamp_micros, '%Y-%m-%d')
  BETWEEN '2016-11-20' AND '2016-11-26'

# 2

# For each following period, count the users whose
# first_open_timestamp falls in week 0
# Here is an example for one of the weeks.
# week 0 is Nov20-Nov26, week 1 is Nov27-Dec03

SELECT
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as count
FROM
  (
   TABLE_DATE_RANGE
    (
     [admob-app-id-xx:xx_IOS.app_events_], 
     TIMESTAMP('2016-11-27'), 
     TIMESTAMP('2016-12-03')
    )
  )
WHERE
  STRFTIME_UTC_USEC(user_dim.first_open_timestamp_micros, '%Y-%m-%d')
  BETWEEN '2016-11-20' AND '2016-11-26'

# 3

# Now we have users for each week w1, w2, ... w5
# Calculate retention for each of them
# retention week 1 = w1 / w0 * 100 = 25.72181359
# rw2 = w2 / w0 * 100
# ...
# rw5 = w5 / w0 * 100

# 4 

# Shift week 0 by one and repeat from step 1
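
The four steps above can be sketched outside BigQuery as follows. This is only a minimal Python sketch on made-up data, not how Firebase computes its cohorts: the `events` tuples, the `weekly_retention` helper and the sample dates are hypothetical, and it assumes every later week is divided by the week 0 cohort size, as in the week 1 line of step 3.

```python
from datetime import date, timedelta

def weekly_retention(events, week0_start, n_weeks):
    """Return retention percentages [rw1, rw2, ...] for the cohort whose
    first open falls in the 7 days starting at week0_start.

    events: iterable of (app_instance_id, first_open_date, event_date).
    """
    week0_end = week0_start + timedelta(days=6)
    # Steps 1 and 2: distinct cohort users seen in each week (index 0 is week 0).
    active = [set() for _ in range(n_weeks + 1)]
    for user, first_open, event_day in events:
        if not (week0_start <= first_open <= week0_end):
            continue  # user belongs to another cohort
        week = (event_day - week0_start).days // 7
        if 0 <= week <= n_weeks:
            active[week].add(user)
    w0 = len(active[0])
    if w0 == 0:
        return []
    # Step 3: share of the week 0 cohort that came back in each later week.
    return [round(len(users) / w0 * 100, 2) for users in active[1:]]

# Hypothetical sample: four users first open the app in week 0 (Nov 20 to Nov 26).
events = [
    ("a", date(2016, 11, 20), date(2016, 11, 20)),
    ("b", date(2016, 11, 21), date(2016, 11, 21)),
    ("c", date(2016, 11, 22), date(2016, 11, 22)),
    ("d", date(2016, 11, 26), date(2016, 11, 26)),
    ("a", date(2016, 11, 20), date(2016, 11, 28)),  # back in week 1
    ("a", date(2016, 11, 20), date(2016, 12, 5)),   # back in week 2
    ("b", date(2016, 11, 21), date(2016, 12, 6)),   # back in week 2
    ("e", date(2016, 11, 27), date(2016, 11, 27)),  # next cohort, ignored
]
retention = weekly_retention(events, date(2016, 11, 20), 2)
print(retention)
```

Step 4 then amounts to calling the helper again with `week0_start + timedelta(days=7)`.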

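BigQuery query tips request

Any tips and directions on how to build a complex query that aggregates and calculates all the data needed for this task in one step are very much appreciated.

Here is the BigQuery Export schema, if needed.

Side questions:

Why are all user_dim.device_info.device_id and user_dim.device_info.resettable_device_id values null?
user_dim.app_info.app_id (in case a Firebase support teammate reads this question)
How should event_dim.timestamp_micros and event_dim.previous_timestamp_micros be used? I cannot figure out their purpose.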

PS

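It would be good if someone from the Firebase team answered this question. Five months ago there was one mention of extending the cohorts functionality with filtering, or of showing BigQuery examples, but nothing has happened. Firebase Analytics is the way to go, they said; Google Analytics is deprecated, they said. I am now spending a second day learning BigQuery to build my own solution on top of the existing analytics tools. I know, Stack Overflow is not the place for this kind of comment, but what are you thinking? Split testing may dramatically affect the retention of my app. My app does not sell anything, and funnels and events are not valuable metrics in many cases.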

Answer 1

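"Any tips and directions on how to build a complex query that aggregates and calculates all the data needed for this task in one step are very much appreciated."

Yes, generic BigQuery will work fine.

Below is not the most generic version, but it can give you an idea. In this example I am using the Stack Overflow data available in Google BigQuery Public Datasets.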

The first sub-select, activities, is in most cases the only one you need to rewrite to reflect the specifics of your data.
What it does:
a. It defines the period to use for the analysis. In the example below it is a month: FORMAT_DATE('%Y-%m', ...
But you can use year, week, day or anything else, respectively:
• By year: FORMAT_DATE('%Y', DATE(answers.creation_date)) AS period
• By week: FORMAT_DATE('%Y-%W', DATE(answers.creation_date)) AS period
• By day: FORMAT_DATE('%Y-%m-%d', DATE(answers.creation_date)) AS period
• ...
b. It also filters "only" the type of events/activity you need to analyze. For example, `WHERE CONCAT('|', questions.tags, '|') LIKE '%|google-bigquery|%'` looks for answers to questions tagged google-bigquery.
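
As a rough illustration of points a and b, here is a small Python sketch. It assumes that Python's `strftime` treats these particular format elements (`%Y`, `%m`, `%d`, `%W`) the same way as BigQuery's `FORMAT_DATE`; the sample date and the `has_tag` helper are made up for the example.

```python
from datetime import date

day = date(2016, 11, 20)  # hypothetical activity date (a Sunday)

# Same format strings as in the FORMAT_DATE variants above.
by_year = day.strftime("%Y")
by_month = day.strftime("%Y-%m")
by_week = day.strftime("%Y-%W")   # %W: week of year, weeks start on Monday
by_day = day.strftime("%Y-%m-%d")

def has_tag(tags, tag):
    """Mimic CONCAT('|', tags, '|') LIKE '%|tag|%': match whole tags only."""
    return f"|{tag}|" in f"|{tags}|"

print(by_year, by_month, by_week, by_day)
print(has_tag("sql|google-bigquery", "google-bigquery"))
print(has_tag("google-bigquery-ml", "google-bigquery"))
```

Wrapping both sides in `|` is what keeps a longer tag that merely starts with `google-bigquery` from matching.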

The rest of the sub-queries are more generic and can mostly be used as is.

#standardSQL
WITH activities AS (
  SELECT answers.owner_user_id AS id,
    FORMAT_DATE('%Y-%m', DATE(answers.creation_date)) AS period
  FROM `bigquery-public-data.stackoverflow.posts_answers` AS answers
  JOIN `bigquery-public-data.stackoverflow.posts_questions` AS questions
  ON questions.id = answers.parent_id
  WHERE CONCAT('|', questions.tags, '|') LIKE '%|google-bigquery|%' 
  GROUP BY id, period
), cohorts AS (
  SELECT id, MIN(period) AS cohort FROM activities GROUP BY id
), periods AS (
  SELECT period, ROW_NUMBER() OVER(ORDER BY period) AS num
  FROM (SELECT DISTINCT cohort AS period FROM cohorts)
), cohorts_size AS (
  SELECT cohort, periods.num AS num, COUNT(DISTINCT activities.id) AS ids 
  FROM cohorts JOIN activities ON activities.period = cohorts.cohort AND cohorts.id = activities.id
  JOIN periods ON periods.period = cohorts.cohort
  GROUP BY cohort, num
), retention AS (
  SELECT cohort, activities.period AS period, periods.num AS num, COUNT(DISTINCT cohorts.id) AS ids
  FROM periods JOIN activities ON activities.period = periods.period
  JOIN cohorts ON cohorts.id = activities.id 
  GROUP BY cohort, period, num 
)
SELECT 
  CONCAT(cohorts_size.cohort, ' - ',  FORMAT("%'d", cohorts_size.ids), ' users') AS cohort, 
  retention.num - cohorts_size.num AS period_lag, 
  retention.period as period_label,
  ROUND(retention.ids / cohorts_size.ids * 100, 2) AS retention , retention.ids AS rids
FROM retention
JOIN cohorts_size ON cohorts_size.cohort = retention.cohort
WHERE cohorts_size.cohort >= FORMAT_DATE('%Y-%m', DATE('2015-01-01'))
ORDER BY cohort, period_lag, period_label

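You can visualize the result of the above query with the tool of your choice. Note: you can use either period_lag or period_label.
See the difference between them in the examples below.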

with period_lag

with period_label
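
To make the logic of the query and the two columns more concrete, here is a minimal Python sketch that follows the same sub-query names on a tiny, made-up `activities` set. It reflects my reading of the query rather than a verified equivalent; the user ids and periods are hypothetical.

```python
from collections import defaultdict

# Hypothetical (user id, period) pairs, playing the role of the activities sub-select.
activities = {
    (1, "2015-01"), (1, "2015-02"), (1, "2015-03"),
    (2, "2015-01"), (2, "2015-03"),
    (3, "2015-02"), (3, "2015-03"),
    (4, "2015-03"),
}

# cohorts: each user's first active period (sorted order puts it first per user).
cohorts = {}
for user, period in sorted(activities):
    cohorts.setdefault(user, period)

# periods: consecutive numbers for the periods in which a cohort starts.
period_num = {p: i for i, p in enumerate(sorted(set(cohorts.values())), start=1)}

# cohorts_size: number of users in each cohort.
cohort_size = defaultdict(int)
for cohort in cohorts.values():
    cohort_size[cohort] += 1

# retention: distinct users of each cohort active in each period.
active = defaultdict(set)
for user, period in activities:
    if period in period_num:  # like the SQL join, skip periods where no cohort starts
        active[(cohorts[user], period)].add(user)

rows = []
for (cohort, period), users in active.items():
    rows.append({
        "cohort": cohort,
        "period_lag": period_num[period] - period_num[cohort],
        "period_label": period,
        "retention": round(len(users) / cohort_size[cohort] * 100, 2),
    })
rows.sort(key=lambda r: (r["cohort"], r["period_lag"]))

for row in rows:
    print(row)
```

In this sample the 2015-02 cohort seen in 2015-03 has period_lag 1 but period_label 2015-03: the lag lines cohorts up by their age, while the label lines them up by calendar period.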

Firebase export to BigQuery: retention cohorts query
