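I would really appreciate your help. I have a dataset of tour purchases. Each tour has a Purchaser_Email and an Event_Date, plus other columns that are not relevant here. I want to add a Trip column that indicates whether a purchase belongs to a new trip or to the same trip. For a purchase to count as a new trip, the difference between two Event_Dates must be more than 30 days; otherwise the purchase is considered part of the same trip. In the end, I need to know how many trips each customer took, and to group the purchases by trip. I wrote a query with ROW_NUMBER() and calculated the date_diff between the first purchase and the next one. I feel I am close, but I need some help adding the Trip column.
I need something like this: Desired Column
This file contains a sample dataset and the column I need: https://docs.google.com/spreadsheets/d/1ToNFQ9l2-ztDrN2zSlKlgBQk95vO6BnRv6VabWrHBmM/edit?usp=sharing The raw data is in the first tab. In the second tab, the orange columns are the results of my query, and the last column, in red, is the one I am looking for.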
Answer 0 (score: 2)
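You are on the right track. After partitioning by purchaser and assigning row_number() or rank(), you can set a boolean flag based on whether two consecutive purchases are separated by at least a given gap.
Here is one way to do it: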
with data as (
  -- Sample purchases; indx ranks each purchaser's purchases by date.
  select
    purchaser_email,
    event_date,
    rank() over (partition by purchaser_email order by event_date) as indx
  from (
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-15') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-12') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-19') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-10-03') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-10-10') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-11-26') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-11-28') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-12-30') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-12-31') as event_date
  )
)
-- Count the purchases that start a new trip, per purchaser.
select purchaser_email, count(1) as order_count
from (
  select
    purchaser_email,
    d1,
    new_purchase,
    -- Running count of trip starts, i.e. the trip number of each purchase.
    sum(case when new_purchase = true then 1 else 0 end)
      over (partition by purchaser_email order by d1) as purchase_count
  from (
    select
      t1.purchaser_email,
      t1.event_date as d1,
      t2.event_date as d2,
      t1.indx as t1i,
      t2.indx as t2i,
      case
        -- No previous purchase: this is the purchaser's first trip.
        when t2.event_date is null then true
        -- Note: >= 30 also treats a gap of exactly 30 days as a new trip.
        when abs(date_diff(t1.event_date, t2.event_date, day)) >= 30 then true
        else false
      end as new_purchase
    from data t1
    -- Pair each purchase with the same purchaser's previous purchase.
    left join data t2
      on t1.purchaser_email = t2.purchaser_email
      and t1.indx - 1 = t2.indx
  )
  order by 1, 2, 3
)
where new_purchase = true
group by 1
order by 1
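The query above returns one trip count per purchaser. If the per-purchase Trip column from the question is what you need, the same idea can be sketched with lag() instead of a self-join. This is only a sketch in BigQuery standard SQL: the `purchases` CTE and the `prev_event_date` and `trip` names are assumptions for illustration, and it uses a strict `> 30` test to match "more than 30 days" in the question.

```sql
-- Sketch only: `purchases` is a hypothetical stand-in for the real table.
with purchases as (
  select 'abc_xyz@xyz.com' as purchaser_email, date '2018-10-12' as event_date union all
  select 'abc_xyz@xyz.com', date '2018-10-15' union all
  select 'abc_xyz@xyz.com', date '2018-11-28' union all
  select 'fgh_xyz@xyz.com', date '2018-10-03' union all
  select 'fgh_xyz@xyz.com', date '2018-11-26'
),
flagged as (
  select
    purchaser_email,
    event_date,
    -- Previous purchase date for the same purchaser (null for the first one).
    lag(event_date) over (
      partition by purchaser_email order by event_date
    ) as prev_event_date
  from purchases
)
select
  purchaser_email,
  event_date,
  -- Running total of trip starts gives each purchase its trip number.
  sum(
    case
      when prev_event_date is null then 1
      when date_diff(event_date, prev_event_date, day) > 30 then 1
      else 0
    end
  ) over (partition by purchaser_email order by event_date) as trip
from flagged
order by purchaser_email, event_date
```

On this sample, the first two abc_xyz purchases fall in trip 1 and the 2018-11-28 purchase in trip 2; the two fgh_xyz purchases fall in trips 1 and 2. Taking max(trip) grouped by purchaser_email then gives the number of trips per customer.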