Firebase event occurrences for newly installed users in BigQuery

Time: 2017-10-03 17:37:44

Tags: google-bigquery firebase-analytics

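Given the users' install date, I want to get, for all 200+ users, the Firebase (1) event occurrence counts and (2) distinct user counts per event, from Day 0 through Day 30. I have mocked up the output tables in the screenshots below (for D0-D30), but the code only covers Day 0 to Day 7.

(1) Event occurrences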

[Image: mock output table of event occurrences]

SELECT
  event.name as event_name,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN event_count END) AS D0_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170802' AND _TABLE_SUFFIX < '20170803' THEN event_count END) AS D1_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170803' AND _TABLE_SUFFIX < '20170804' THEN event_count END) AS D2_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170804' AND _TABLE_SUFFIX < '20170805' THEN event_count END) AS D3_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170805' AND _TABLE_SUFFIX < '20170806' THEN event_count END) AS D4_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170806' AND _TABLE_SUFFIX < '20170807' THEN event_count END) AS D5_USERS,  
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170807' AND _TABLE_SUFFIX < '20170808' THEN event_count END) AS D6_USERS,  
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170808' AND _TABLE_SUFFIX < '20170809' THEN event_count END) AS D7_USERS    
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY 1;

(2) Distinct user counts

[Image: mock output table of distinct user counts]

SELECT
  event.name as event_name,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN user_dim.app_info.app_instance_id END) AS D0_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170802' AND _TABLE_SUFFIX < '20170803' THEN user_dim.app_info.app_instance_id END) AS D1_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170803' AND _TABLE_SUFFIX < '20170804' THEN user_dim.app_info.app_instance_id END) AS D2_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170804' AND _TABLE_SUFFIX < '20170805' THEN user_dim.app_info.app_instance_id END) AS D3_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170805' AND _TABLE_SUFFIX < '20170806' THEN user_dim.app_info.app_instance_id END) AS D4_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170806' AND _TABLE_SUFFIX < '20170807' THEN user_dim.app_info.app_instance_id END) AS D5_USERS,  
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170807' AND _TABLE_SUFFIX < '20170808' THEN user_dim.app_info.app_instance_id END) AS D6_USERS,  
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170808' AND _TABLE_SUFFIX < '20170809' THEN user_dim.app_info.app_instance_id END) AS D7_USERS    
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809'
  AND user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY 1;

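Question:

  • Is there a more optimal way to write this? For a small number of columns (D0-D7) this makes sense, but for D0-D30 I think there may be a better approach. Any suggestions are much appreciated!

Final answer after Mikhail's feedback:

I merged the two queries into one and then created a pivot table from the result. Remember to select "Standard SQL" in the BigQuery editor before running it.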

SELECT
  event.name AS event_name,
  _TABLE_SUFFIX as day,
  COUNT(1) as event_occurances,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as event_unique_users
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170901' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY event_name, day
ORDER BY event_name;
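
Because this query returns one row per event and day suffix, the D0 to D30 labels from the question have to be derived afterwards. A minimal Python sketch of that step, assuming the install date 2017-08-01 used in the queries; the helper name `day_label` is hypothetical and not part of the original answer:

```python
from datetime import date, datetime

# Install day used in the question's queries (assumption for this sketch).
INSTALL_DATE = date(2017, 8, 1)

def day_label(table_suffix: str, install_date: date = INSTALL_DATE) -> str:
    """Turn a _TABLE_SUFFIX value such as '20170803' into a label such as 'D2'."""
    day = datetime.strptime(table_suffix, "%Y%m%d").date()
    return f"D{(day - install_date).days}"
```

Applying `day_label` to the `day` column gives the column headers for the pivot table.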

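Appendix notes:

Timestamp conversion for August 1, 2017

  • Epoch timestamp: 1501545600
  • Timestamp in milliseconds: 1501545600000 (the queries use microseconds: 1501545600000000)

Timestamp conversion for August 2, 2017

  • Epoch timestamp: 1501632000
  • Timestamp in milliseconds: 1501632000000 (the queries use microseconds: 1501632000000000)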
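
The bounds used for `first_open_timestamp_micros` can be reproduced rather than looked up. A small Python sketch, assuming the install day is meant as a UTC calendar day; the function name `utc_midnight_micros` is made up for illustration:

```python
from datetime import datetime, timezone

def utc_midnight_micros(year: int, month: int, day: int) -> int:
    """Microseconds since the Unix epoch at 00:00:00 UTC on the given date."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1_000_000

start = utc_midnight_micros(2017, 8, 1)  # lower bound in the queries
end = utc_midnight_micros(2017, 8, 2)    # upper bound in the queries
```

Note that `BETWEEN` is inclusive at both ends, so the upper bound also matches a first open at exactly midnight on August 2.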

1 answer:

Answer 0 (score: 1)

Is there a more optimal way to write this?

1. One way to optimize this approach is to rewrite

COUNT(CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN event_count END) AS D0_USERS

into this

COUNTIF(_TABLE_SUFFIX = '20170801') AS D0_USERS

:o( You still need to write this line 31 times for the D0-D30 case, but at least each line is less heavy
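
If you stay with this approach, the 31 lines do not have to be typed by hand. A rough Python sketch that generates them, assuming the install date 2017-08-01 and the `Dn_USERS` naming from the question; `countif_columns` is a hypothetical helper, not something BigQuery provides:

```python
from datetime import date, timedelta

def countif_columns(install_date: date, days: int = 31) -> str:
    """Build one COUNTIF column per day, D0 through D(days - 1)."""
    lines = []
    for n in range(days):
        suffix = (install_date + timedelta(days=n)).strftime("%Y%m%d")
        lines.append(f"  COUNTIF(_TABLE_SUFFIX = '{suffix}') AS D{n}_USERS")
    return ",\n".join(lines)

# Text to paste between "event.name AS event_name," and "FROM ..." in the query.
columns_sql = countif_columns(date(2017, 8, 1))
```

For D0-D30 the `_TABLE_SUFFIX` filter in the `WHERE` clause would also need to extend to `< '20170901'`.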

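2. The other (proper) way is to follow best practice and separate data retrieval from data visualization

So you can do something like the following to retrieve the needed data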

#standardSQL
SELECT
  event.name AS event_name,
  _TABLE_SUFFIX as day,
  COUNT(1) as users
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY event_name, day   

Then you can use whatever tool you like to pivot this result
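
As an illustration of the pivot step outside any particular tool, here is a minimal Python sketch. The sample rows and event names are hypothetical stand-ins for the query output (`event_name`, `day`, `users`), not real data:

```python
from collections import defaultdict

# Hypothetical sample rows in the shape returned by the query above.
rows = [
    ("first_open", "20170801", 120),
    ("first_open", "20170802", 3),
    ("session_start", "20170801", 340),
    ("session_start", "20170803", 95),
]

def pivot(rows):
    """Reshape (event_name, day, count) rows into one dict of days per event."""
    days = sorted({day for _, day, _ in rows})
    # Every event gets an entry for every day, defaulting to 0.
    table = defaultdict(lambda: dict.fromkeys(days, 0))
    for event_name, day, count in rows:
        table[event_name][day] = count
    return days, dict(table)

days, table = pivot(rows)
```

Each key of `table` is one output row and each entry of `days` is one output column, which is the wide layout mocked up in the question.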

For example, with BigQuery Mate you can get something like the following without leaving the UI

[Image: pivoted query result in BigQuery Mate]

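As a quick disclosure: I am the author of the BigQuery Mate Chrome extension

Please note: I did not tune or change any of the logic in your query. I only answered your specific question: is there a more optimal way to write it?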