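I want to identify users who had a "first_open" event in month a (here: January) and who came back with a "user_engagement" event in month b (here: February).
My idea:
1. Create a table of all users with a "first_open" event.
2. Create a table of all users with a "user_engagement" event.
3. Join the two tables on the user ID.
4. Count the users who had a "first_open" event in month a and came back in month b, and count all users with a "first_open" event in January.
With the following query I am currently overcounting users in both month a and month b, because I am not filtering out the users who do not have both event types.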
With
users_first_open as (
  select
    user_pseudo_id,
    EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) AS install_month,
    event_name as firstopen
  FROM
    `table.events_*`
  where _TABLE_SUFFIX BETWEEN '20190101' AND '20190108'
    and event_name = "first_open"
    and EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) = 1
),
user_enagement_next_month as (
  select
    user_pseudo_id,
    EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) AS engagement_month,
    event_name as engagament_next_month
  FROM
    `table.events_*`
  where _TABLE_SUFFIX BETWEEN '20190109' AND '20190116'
    and event_name = "user_engagement"
    and EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) = 1
),
cohort_raw as (
  select
    user_pseudo_id,
    install_month,
    engagement_month,
    case when firstopen = "first_open" then 1 else 0 end as cohort_count_first_open,
    case when engagament_next_month = "user_engagement" then 1 else 0 end as cohort_count_engagement
  from
    user_enagement_next_month
  full join
    users_first_open using (user_pseudo_id)
)
select
  sum(case when cohort_count_first_open is not null then 1 else 0 end) as users_first_open,
  (select sum(case when cohort_count_engagement is not null then 1 else 0 end) as u_engagement_open from cohort_raw where cohort_count_first_open = 1) as users_engagement_open
from cohort_raw
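What I tried next: in the second table, "user_enagement_next_month", I grouped by user ID etc., and I built a sum of the "first_open" case and the engagement case for the rows where each holds. In a later version I extended the query to count only the users for whom that sum equals 2.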
With
users_first_open as (
  select
    user_pseudo_id,
    EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) AS install_month,
    event_name as firstopen
  FROM
    `table.events_*`
  where _TABLE_SUFFIX BETWEEN '20190101' AND '20190131'
    and event_name = "first_open"
    and EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) = 1
),
user_enagement_next_month as (
  select
    user_pseudo_id,
    EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) AS engagement_month,
    event_name as engagament_next_month
  FROM
    `table.events_*`
  where _TABLE_SUFFIX BETWEEN '20190201' AND '20190228'
    and event_name = "session_start"
    and EXTRACT(Month FROM DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))) = 2
  group by 1,2,3
),
cohort_raw as (
  select
    user_pseudo_id,
    install_month,
    engagement_month,
    case when firstopen = "first_open" then 1 else 0 end as cohort_count_first_open,
    case when engagament_next_month = "session_start" then 1 else 0 end as cohort_count_engagement
    --case when user_pseudo_id is not null then 1 else 0 end as cohort_count_engagement
  from
    user_enagement_next_month
  full join
    users_first_open using (user_pseudo_id)
),
cohort_agg as (
  select *, cohort_count_first_open+cohort_count_engagement as cohort_sum
  from cohort_raw
  group by 1,2,3,4,5
  order by 6 desc
)
select
  (select count(*) from users_first_open) as cohort_jan,
  (select Sum(cohort_sum) from cohort_agg where cohort_sum = 2) as ret,
  sum(case when cohort_count_first_open is not null then 1 else 0 end) as users_first_open,
  (select sum(case when cohort_count_engagement is not null then 1 else 0 end) as u_engagement_open from cohort_raw where cohort_count_first_open = 1) as users_engagement_open
from cohort_agg
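I expect a return rate of around 20%. Currently my output is 54%, so my query is either overcounting or undercounting somewhere; I assume my join is not working as intended.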
Answer 0 (score: 0)
Maybe I have not fully understood what you want, but try this:
with
users_first_open as (
  select distinct -- are there duplicates for one user_id?
    user_pseudo_id,
    extract(
      month from
      timestamp_micros(user_first_touch_timestamp)
    ) as install_month
  from
    `table.events_201901*` -- longer prefixes generally perform better
  where
    _table_suffix between '01' and '31'
    and event_name = 'first_open'
    and extract(
      month from
      timestamp_micros(user_first_touch_timestamp)
    ) = 1
),
user_enagement_next_month as (
  select distinct
    user_pseudo_id,
    -- month of the engagement event itself, taken from event_date (a 'YYYYMMDD' string)
    extract(
      month from
      parse_date('%Y%m%d', event_date)
    ) as engagement_month
  from
    `table.events_201902*` -- longer prefixes generally perform better
  where
    _table_suffix between '01' and '28'
    and event_name = 'user_engagement'
    -- no filter on user_first_touch_timestamp here: a January installer's first touch
    -- is in January, so requiring month = 2 would leave nothing to join against
)
select
  ufo.install_month,
  uenm.engagement_month,
  count(*) as first_open_event_users_cnt,
  count(uenm.user_pseudo_id) as user_engagement_event_users_cnt
from
  users_first_open as ufo
  left join user_enagement_next_month as uenm
    on ufo.user_pseudo_id = uenm.user_pseudo_id
group by
  1, 2
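If what you need is a single return rate rather than grouped counts, the same left-join idea can be sketched as below. This is only an illustration under assumptions: the `events` CTE holds made-up toy rows (users u1 to u4) standing in for `table.events_*`, and the names `january_installers`, `february_engaged`, `cohort_jan`, `retained_feb` and `retention_rate` are hypothetical. Because both sides are deduplicated per user before the join, each January user matches at most one row, which avoids the overcounting a full join on raw event rows tends to produce.

```sql
with
  events as (
    -- hypothetical toy rows standing in for `table.events_*`
    select * from unnest([
      struct('u1' as user_pseudo_id, 'first_open' as event_name, '20190105' as event_date),
      struct('u1', 'user_engagement', '20190105'),
      struct('u1', 'user_engagement', '20190210'),
      struct('u1', 'user_engagement', '20190211'),
      struct('u2', 'first_open', '20190120'),
      struct('u3', 'first_open', '20190125'),
      struct('u3', 'user_engagement', '20190203'),
      struct('u4', 'first_open', '20190202'),
      struct('u4', 'user_engagement', '20190215')
    ])
  ),
  january_installers as (
    -- one row per user whose first_open event falls in January
    select distinct user_pseudo_id
    from events
    where event_name = 'first_open'
      and event_date between '20190101' and '20190131'
  ),
  february_engaged as (
    -- one row per user with at least one user_engagement event in February
    select distinct user_pseudo_id
    from events
    where event_name = 'user_engagement'
      and event_date between '20190201' and '20190228'
  )
select
  count(*) as cohort_jan,                    -- all January installers
  count(f.user_pseudo_id) as retained_feb,   -- those who also engaged in February
  safe_divide(count(f.user_pseudo_id), count(*)) as retention_rate
from january_installers as j
left join february_engaged as f
  on j.user_pseudo_id = f.user_pseudo_id
```

On the toy rows this gives cohort_jan = 3 (u1, u2, u3), retained_feb = 2 (u1, u3) and a retention_rate of about 0.67; u4 is ignored because it installed in February. To run it on real data, replace the `events` CTE with your wildcard table and, if you prefer, filter on `_table_suffix` instead of `event_date`.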