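I have a table that tracks all subscribers to various lists. Its columns are: list_id, user_id, action_date, action. When a user subscribes to a list, we record a row with a timestamp and action = 1. If they unsubscribe, we add a new row with a timestamp and action = 0.
We recently launched a few new subscriptions, and I'd like to figure out:
I'm still new to SQL, so the added wrinkle that the latest action for each user_id must be a 1 is throwing me off.
Can anyone help me?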
Answer 0: (score: 0)
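You first need to get the current status of each subscription. You can get it with a statement like this (in fairly plain SQL, since you didn't specify the exact server; the select ... into #tmp temp table is SQL Server syntax):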
-- latest action per (user_id, list_id), exposed as the current status
select a.user_id, a.list_id, b.last_date, a.action as status
into #tmp
from Actions a
join
(
-- most recent action_date for each user and list
select user_id, list_id, MAX(action_date) last_date
from Actions
where list_id in ('A', 'B')
group by user_id, list_id
) b
on a.user_id = b.user_id
and a.list_id = b.list_id
and a.action_date = b.last_date
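In subquery 'b' we get the latest action_date for each user_id and list_id. We then have to join back to the main table to pick up the status of that row. The result is stored in the temporary table #tmp.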
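One caveat with the join on MAX(action_date): if a user has two rows for the same list with an identical action_date, both rows come back. On a server with window functions (SQL Server, for example) the same "latest row per user and list" idea could be sketched with ROW_NUMBER, which always keeps exactly one row per pair. This is only a sketch; it assumes the table is called Actions, as in the statement above, and that the columns are those listed in the question.

```sql
-- Sketch, not tested against real data.
-- Assumes Actions(list_id, user_id, action_date, action) as described in the question.
with ranked as (
    select
        user_id,
        list_id,
        action_date,
        action,
        -- rn = 1 marks the most recent row for each (user_id, list_id)
        row_number() over (
            partition by user_id, list_id
            order by action_date desc
        ) as rn
    from Actions
    where list_id in ('A', 'B')
)
select
    user_id,
    list_id,
    action_date as last_date,
    action      as status
from ranked
where rn = 1;
```

If two rows tie on action_date, ROW_NUMBER picks one of them arbitrarily, so a tie-breaking column in the order by would be needed for a deterministic result.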
You can then find the answers to questions 1 and 2 in several ways. For example, 'how many users are subscribed to both list_id A and list_id B':
select distinct t1.user_id
from #tmp t1
where
-- user is currently subscribed to list A
exists (
select 1
from #tmp t2
where t1.user_id = t2.user_id
and t2.list_id = 'A'
and t2.status = 1
)
-- and currently subscribed to list B
and exists (
select 1
from #tmp t3
where t1.user_id = t3.user_id
and t3.list_id = 'B'
and t3.status = 1
)
How many users are subscribed only to list_id A and not to list_id B:
select distinct t1.user_id
from #tmp t1
where
-- user is currently subscribed to list A
exists (
select 1
from #tmp t2
where t1.user_id = t2.user_id
and t2.list_id = 'A'
and t2.status = 1
)
-- but not currently subscribed to list B
and not exists (
select 1
from #tmp t3
where t1.user_id = t3.user_id
and t3.list_id = 'B'
and t3.status = 1
)
And vice versa:
select distinct t1.user_id
from #tmp t1
where
-- user is not currently subscribed to list A
not exists (
select 1
from #tmp t2
where t1.user_id = t2.user_id
and t2.list_id = 'A'
and t2.status = 1
)
-- but is currently subscribed to list B
and exists (
select 1
from #tmp t3
where t1.user_id = t3.user_id
and t3.list_id = 'B'
and t3.status = 1
)
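I used this form because all the statements look alike and are easy to modify. Depending on your server and the exact table definition, there are shorter ways to formulate the answers. For SQL Server, for example, I prefer to use ROW_NUMBER to get the current status of the subscriptions. The first question can be answered like this: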
select user_id
from #tmp
where
list_id in ('A', 'B')
and status = 1
group by user_id
having COUNT(*) = 2
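Since the questions ask "how many", the three answers could also be collected as counts in a single pass with conditional aggregation. The following is only a sketch: it assumes #tmp holds one row per user_id and list_id with a status column, and the names on_a, on_b, per_user and the output column names are made up for illustration.

```sql
-- Sketch, not tested against real data.
-- Assumes #tmp(user_id, list_id, status) with one row per (user_id, list_id).
select
    sum(case when on_a = 1 and on_b = 1 then 1 else 0 end) as both_a_and_b,
    sum(case when on_a = 1 and on_b = 0 then 1 else 0 end) as only_a,
    sum(case when on_a = 0 and on_b = 1 then 1 else 0 end) as only_b
from (
    -- one row per user with a 0/1 flag for each list
    select
        user_id,
        max(case when list_id = 'A' and status = 1 then 1 else 0 end) as on_a,
        max(case when list_id = 'B' and status = 1 then 1 else 0 end) as on_b
    from #tmp
    group by user_id
) per_user;
```

The inner query reduces each user to two flags, and the outer query only counts flag combinations, so adding a third list would mean one more flag and a few more sum lines.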
I haven't checked these statements against real data, so there may be errors. I was mainly trying to convey the idea.