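I have a table called activities that has a memberId and a timestamp. I want to find out how many members performed an activity in a given month (i.e., have a record in the activities table) but had no activity in the preceding 12 months. I think LEAD/LAG would help here, but I can't wrap my head around it.
(I've tagged both Apache Hadoop and MS SQL Server here because I can do this in either, and I think I can easily translate a solution from one to the other.)
Any help is appreciated!
Thanks!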
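Answer 0 (score: 1)
With the LAG function, we first need to create one record per member and month, then use LAG to get each member's previous month with activity, and finally use a WHERE clause to keep only what we want: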
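DECLARE
    @year int = 2018,
    @month int = 7;
WITH
-- One row per member and calendar month with activity (dates truncated to the first of the month)
monthwise (MemberID, FirstOfMonth) AS (
    SELECT DISTINCT MemberID, DATEADD(month, DATEDIFF(month, 0, ActivityDate), 0)
    FROM Activities
),
-- For each active month, the member's previous active month (NULL if there is none)
prevActivity (MemberID, FirstOfMonth, prevFirstOfMonth) AS (
    SELECT MemberID, FirstOfMonth
         , LAG(FirstOfMonth) OVER (PARTITION BY MemberID ORDER BY FirstOfMonth)
    FROM monthwise
)
SELECT MemberID
FROM prevActivity
WHERE MONTH(FirstOfMonth) = @month
  AND YEAR(FirstOfMonth) = @year
  -- Keep first-time members and members whose previous active month is more than 12 months back
  AND (prevFirstOfMonth IS NULL OR DATEDIFF(month, prevFirstOfMonth, FirstOfMonth) > 12)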
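If you want to sanity-check the LAG logic outside the database, here is a minimal Python sketch of the same three steps (one entry per member and month, previous active month, filter). The sample rows and the names `month_index` and `members_via_lag` are made up for illustration; they are not part of the question's schema.

```python
from datetime import date

# Hypothetical sample rows (member_id, activity_date); not taken from the question.
activities = [
    (1, date(2018, 7, 3)),   # first activity ever -> qualifies
    (2, date(2017, 7, 20)),  # previous active month is exactly 12 months back -> excluded
    (2, date(2018, 7, 5)),
    (3, date(2017, 6, 30)),  # previous active month is 13 months back -> qualifies
    (3, date(2018, 7, 9)),
    (3, date(2018, 7, 15)),  # same-month duplicate collapses into one member-month
    (4, date(2018, 6, 1)),   # no activity in the target month
]

def month_index(d):
    """Months since year 0; differences behave like DATEDIFF(month, a, b)."""
    return d.year * 12 + (d.month - 1)

def members_via_lag(rows, year, month):
    target = year * 12 + (month - 1)
    # Like the monthwise CTE: one entry per member and month.
    months_by_member = {}
    for member_id, d in rows:
        months_by_member.setdefault(member_id, set()).add(month_index(d))
    result = []
    for member_id, months in sorted(months_by_member.items()):
        if target not in months:
            continue
        earlier = [m for m in months if m < target]
        prev = max(earlier) if earlier else None  # what LAG returns for the target month
        if prev is None or target - prev > 12:
            result.append(member_id)
    return result

print(members_via_lag(activities, 2018, 7))  # [1, 3]
```

The question asks for a count, so `len(...)` of the result (or `COUNT(*)` around the SQL query) gives the final number.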
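You can also do this without the LAG function: use two queries, one for the members active in this month and one for the members active in the preceding twelve months. Then use an inner join and a left join to find the members who were active this month but not in the preceding months.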
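DECLARE
    @year int = 2018,
    @month int = 7;
WITH
-- Members with activity in the target month
this (MemberID) AS (
    SELECT DISTINCT MemberID
    FROM Activities
    WHERE YEAR(ActivityDate) = @year
      AND MONTH(ActivityDate) = @month
),
-- Members with activity in the 12 months before the target month.
-- DATEADD(month, n, 0) counts n months from 1900-01-01, so the two bounds are
-- the first of the target month and the first of the same month one year earlier.
prev (MemberID) AS (
    SELECT DISTINCT MemberID
    FROM Activities
    WHERE ActivityDate < DATEADD(month, @month-1 +12*(@year-1900), 0)
      AND ActivityDate >= DATEADD(month, @month-1 +12*(@year-1901), 0)
)
SELECT m.MemberID
FROM Members m
INNER JOIN this ON m.MemberID = this.MemberID
LEFT JOIN prev ON m.MemberID = prev.MemberID
WHERE prev.MemberID IS NULL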
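The join version boils down to a set difference over a half-open date window. A rough Python sketch of that idea, with made-up sample rows and hypothetical helper names (`window`, `members_via_sets`), might look like this:

```python
from datetime import date

# Hypothetical sample rows (member_id, activity_date); not taken from the question.
activities = [
    (1, date(2018, 7, 3)),
    (2, date(2017, 7, 20)),  # falls inside the 12-month window
    (2, date(2018, 7, 5)),
    (3, date(2017, 6, 30)),  # one day before the window starts
    (3, date(2018, 7, 9)),
    (4, date(2018, 6, 1)),   # in the window, but not active in the target month
]

def window(year, month):
    """Half-open range [start, end) covering the 12 months before the target month."""
    return date(year - 1, month, 1), date(year, month, 1)

def members_via_sets(rows, year, month):
    start, end = window(year, month)
    # Like the "this" CTE: members active in the target month.
    this = {m for m, d in rows if (d.year, d.month) == (year, month)}
    # Like the "prev" CTE: members active in the preceding 12 months.
    prev = {m for m, d in rows if start <= d < end}
    # INNER JOIN this + LEFT JOIN prev ... IS NULL is an anti-join, i.e. a set difference.
    return sorted(this - prev)

print(window(2018, 7))                        # (datetime.date(2017, 7, 1), datetime.date(2018, 7, 1))
print(members_via_sets(activities, 2018, 7))  # [1, 3]
```

Using a half-open range keeps the comparison correct whether the timestamp column carries a time of day or not.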
Answer 1 (score: -1)
You can use lag() to do this:
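-- Caveat: lag() runs over individual rows here, so a member's second activity
-- within the same month has a prev_ts inside that month and the member is subtracted.
select year(ts), month(ts),
       (count(distinct memberid) -
        count(distinct case when prev_ts > dateadd(year, -1, ts) then memberid end)
       ) as new_or_returning_members
from (select memberid, ts,
             lag(ts) over (partition by memberid order by ts) as prev_ts
      from activities a
     ) a
group by year(ts), month(ts);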
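One thing to watch with a per-month count like this: if lag() runs over raw rows, a member with two activities in the same month gets a previous timestamp inside that month. A sketch of the per-month count that first collapses rows to one entry per member and month, then looks at the previous active month, could be written as follows. The sample rows and the name `counts_by_month` are made up for illustration, and the 12-month cutoff follows the month-based reading used in the first answer.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows (member_id, activity_date); not taken from the question.
rows = [
    (1, date(2018, 7, 3)),
    (1, date(2018, 7, 28)),  # second activity in the same month
    (2, date(2018, 5, 10)),
    (2, date(2018, 7, 5)),   # previous active month is 2 months back
    (3, date(2017, 6, 30)),
    (3, date(2018, 7, 9)),   # previous active month is 13 months back
]

def counts_by_month(rows):
    # Collapse to one entry per member and month before looking at the previous month.
    months_by_member = {}
    for member_id, d in rows:
        months_by_member.setdefault(member_id, set()).add(d.year * 12 + d.month - 1)
    counts = Counter()
    for months in months_by_member.values():
        prev = None
        for m in sorted(months):
            # Count the member for month m if there is no earlier active month,
            # or the previous active month is more than 12 months back.
            if prev is None or m - prev > 12:
                counts[divmod(m, 12)] += 1  # key is (year, zero-based month)
            prev = m
    return {(year, mo + 1): n for (year, mo), n in counts.items()}

print(counts_by_month(rows))
```

Member 1 is counted once for July 2018 even though it has two rows in that month.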