BigQuery新架构-Firebase中的会话持续时间分布表

时间:2018-09-21 08:21:57

标签: firebase google-bigquery firebase-analytics

Here is result how I want to have

这是根据旧架构的Bigquery,在旧数据源表中正常运行。这是根据旧模式的查询。假定这可以进行会话持续时间的合理分配,这意味着它将提供会话持续时间(秒)的明智会话数(例如,它在图像空白处)。

 select (case when engagement_time1 > 0 and engagement_time1 <= 30 then "0-30" else
(case when engagement_time1 > 30 and engagement_time1 <= 60 then "31-60" else 
(case when engagement_time1 > 60 and engagement_time1 <= 180 then "61-180" else
(case when engagement_time1 > 180 and engagement_time1 <= 300 then "181-300" else
(case when engagement_time1 > 300 and engagement_time1 <= 600 then "301-600" else
(case when engagement_time1 > 600 and engagement_time1 <= 1800 then "601-1800" else 
(case when engagement_time1 > 1800 then "1800+" else
"0" end) end) end) end) end) end) end) 
 as engagement_bracket_in_seconds,
count(*) total_sessions
, sum(engagement_time1) total_engagement_time
from
(
SELECT app_instance_id, sess_id, MIN(min_time) sess_start, MAX(max_time) sess_end, COUNT(*) records, MAX(sess_id) OVER(PARTITION BY app_instance_id) total_sessions,
   (ROUND((MAX(max_time)-MIN(min_time))/(1000*1000),1)) sess_length_seconds,
       sum(case when name = "user_engagement" then engagement_time else 0 end)/1000 engagement_time1

FROM (
  SELECT *, SUM(session_start) OVER(PARTITION BY app_instance_id ORDER BY min_time) sess_id
  FROM (
    SELECT *, IF(
                previous IS null 
                OR (min_time-previous)>(30*60*1000*1000),  # sessions broken by this inactivity 
                1, 0) session_start 
                #https://blog.modeanalytics.com/finding-user-sessions-sql/
    FROM (
      SELECT *, LAG(max_time, 1) OVER(PARTITION BY app_instance_id ORDER BY max_time) previous
      FROM (
        SELECT user_dim.app_info.app_instance_id
          , (SELECT MIN(timestamp_micros) FROM UNNEST(event_dim)) min_time
          , (SELECT MAX(timestamp_micros) FROM UNNEST(event_dim)) max_time,
                event.name,
                params.value.int_value engagement_time
        FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160601`,
        UNNEST(event_dim) as event,
        UNNEST(event.params) as params,
        UNNEST(user_dim.user_properties) as user_params
        where (event.name = "user_engagement" and params.key = "engagement_time_msec")
        and
                (user_params.key = "access" and user_params.value.value.string_value = "true") and
                PARSE_DATE('%Y%m%d', event.date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
                PARSE_DATE('%Y%m%d', event.date) <= "{{upto_date (yyyy-mm-dd)}}"
      )
    )
  )
)
GROUP BY 1, 2

) where sess_id > 0
group by 1
ORDER BY (total_engagement_time/total_sessions)

旧查询的结果正确,如下图所示

Click to see the result here

  

由于Bigquery架构已更改,因此我尝试复制旧的   根据新架构查询,我也更改了数据源表(我是   确保所有事件都已记录在该表中)。

现在,我添加的代码行(根据新架构)替换了旧查询的某些部分,并且我将其作为一个整体运行,它也可以正常工作。

* 增加了部分(对新模式的附加) *

SELECT L.user_pseudo_id as app_instance_id ,R.min_time min_time,R.max_time max_time,L.event_name,L.eng_time engagement_time FROM  
              (SELECT  user_pseudo_id 
                        ,event_name
                        ,params.value.int_value eng_time
                    FROM `new_abc_datasource`,
                    UNNEST(event_params) as params,
                    UNNEST(user_properties) as user_params
                    where (event_name = "user_engagement" and params.key = "engagement_time_msec")
                    and
                            (user_params.key = "access" and user_params.value.string_value = "true") and
                            PARSE_DATE('%Y%m%d', event_date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
                            PARSE_DATE('%Y%m%d', event_date) <= "{{upto_date (yyyy-mm-dd)}}"
               ) as L 
               left join (SELECT  user_pseudo_id 
                       , MIN(event_timestamp) AS min_time
                      ,MAX(event_timestamp) AS max_time
                    FROM `analytics_151475732.events_*`,
                    UNNEST(event_params) as params,
                    UNNEST(user_properties) as user_params
                    where (event_name = "user_engagement" and params.key = "engagement_time_msec")
                    and
                            (user_params.key = "access" and user_params.value.string_value = "true") and
                            PARSE_DATE('%Y%m%d', event_date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
                            PARSE_DATE('%Y%m%d', event_date) <= "{{upto_date (yyyy-mm-dd)}}"
                    GROUP BY user_pseudo_id) as R 
                ON L.user_pseudo_id=R.user_pseudo_id

* 替换旧查询的一部分 *

 SELECT user_dim.app_info.app_instance_id 
          , (SELECT MIN(timestamp_micros) FROM UNNEST(event_dim)) min_time
          , (SELECT MAX(timestamp_micros) FROM UNNEST(event_dim)) max_time,
                event.name,
                params.value.int_value engagement_time
        FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160601`,
        UNNEST(event_dim) as event,
        UNNEST(event.params) as params,
        UNNEST(user_dim.user_properties) as user_params
        where (event.name = "user_engagement" and params.key = "engagement_time_msec")
        and
                (user_params.key = "access" and user_params.value.value.string_value = "true") and
                PARSE_DATE('%Y%m%d', event.date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
                PARSE_DATE('%Y%m%d', event.date) <= "{{upto_date (yyyy-mm-dd)}}"
  

新结果中的问题,减少了total_sessions   剧烈地。有人可以引导我说,这是对的吗   复制旧查询?因为我得到的结果是不同的   从旧的过去给出的结果开始。

0 个答案:

没有答案