SQL to determine distinct periods of consecutive days of access?

Asked: 2009-07-24 10:27:57

Tags: sql sql-server date

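Jeff recently asked this question and got some great answers.

Jeff's question was about finding users who have logged in to the system on (n) consecutive days, using a database table structured as follows: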

Id      UserId   CreationDate
------  ------   ------------
750997      12   2009-07-07 18:42:20.723
750998      15   2009-07-07 18:42:20.927
751000      19   2009-07-07 18:42:22.283

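Read the original question first for clarity, and then...

I'm interested in the problem of determining how many distinct (n)-day periods a user has.

Is it possible to write a fast SQL query that returns a list of users together with the number of distinct (n)-day periods each of them has?

Edit: per the comments below, suppose someone has 2 consecutive days, then a gap, then 4 consecutive days, then a gap, then 8 consecutive days. That would be 3 "distinct 4-day periods": the 2-day run counts for nothing, the 4-day run counts once, and the 8-day run counts as two back-to-back 4-day periods.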

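3 Answers:

Answer 0: (score: 1)

My answer doesn't seem to have shown up...

I'll try again...

Rob Farley's answer to the original question has the handy benefit of including the number of consecutive days in each run.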

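-- Assumes at most one row per user per calendar day.
with numberedrows as
(
        -- Row number minus day number stays constant within a run of consecutive days.
        select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
        from tablename
)
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset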
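
To see why grouping on the offset works, the same idea can be traced outside the database. The following is only a sketch in Python, not part of Rob's answer: the function name `consecutive_runs` and the sample dates are made up for illustration, and it assumes at most one visit per calendar day, as the query does.

```python
from datetime import date
from itertools import groupby

def consecutive_runs(visit_days):
    """Return (first_day, last_day, length) for each run of consecutive days.

    Assumes visit_days holds at most one entry per calendar day.
    """
    ordered = sorted(visit_days)
    # Row number minus day number stays constant while the days are consecutive,
    # and drops every time a gap is crossed.
    keyed = [(row - day.toordinal(), day) for row, day in enumerate(ordered, start=1)]
    runs = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        days = [day for _, day in group]
        runs.append((days[0], days[-1], len(days)))
    return runs

# Hypothetical sample: a 2-day run, a gap, then a 3-day run.
sample = [date(2009, 7, 1), date(2009, 7, 2),
          date(2009, 7, 5), date(2009, 7, 6), date(2009, 7, 7)]
runs = consecutive_runs(sample)
```

Each tuple in `runs` corresponds to one row of the query's output for a single user: the first day, the last day and the number of consecutive days.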

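Using integer division, simply dividing the number of consecutive days by the period length gives the number of "distinct (n)-day periods" covered by each whole consecutive run. For n = 4:
- 2/4 = 0
- 4/4 = 1
- 8/4 = 2
- 9/4 = 2
- and so on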
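
As a quick check of that arithmetic, here is a small Python sketch. The function name `distinct_periods` is invented for this example; `//` plays the role of SQL Server's integer division.

```python
def distinct_periods(run_lengths, period_length):
    """Count the whole period_length-day periods contained in a user's runs."""
    # Each run contributes its length divided by the period length, rounded down.
    return sum(length // period_length for length in run_lengths)

# Runs of 2, 4 and 8 consecutive days, as in the question's edit: 0 + 1 + 2.
total = distinct_periods([2, 4, 8], 4)
```

Here `total` comes out as 3, matching the "3 distinct 4-day periods" described in the question.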

So here is my take on Rob's answer, adapted to your needs... (I really like Rob's answer. Go and read the explanation, it's inspired thinking!)

declare @period_length int
set @period_length = 4; -- example value: count distinct 4-day periods

with
    numberedrows (
        UserID,
        TheOffset
    )
as
(
    select
        UserID,
        row_number() over (partition by UserID order by CreationDate)
            - DATEDIFF(DAY, 0, CreationDate) as TheOffset
    from
        tablename
),
    ConsecutiveCounts(
        UserID,
        ConsecutiveDays
    )
as
(
    select
        UserID,
        count(*) as ConsecutiveDays
    from
        numberedrows
    group by
        UserID,
        TheOffset
)
select
    UserID,
    SUM(ConsecutiveDays / @period_length) AS distinct_n_day_periods
from
    ConsecutiveCounts
group by
    UserID

The only real difference is that I take Rob's results and then run them through another GROUP BY...

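Answer 1: (score: 1)

So, I would start from the query in the previous question, which lists each run of consecutive days. Then I would group that by UserID and NumConsecutiveDays to count how many runs of each length each user has.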

with numberedrows as
(
        select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
        from tablename
)
,
runsOfDays as
(
-- One row per run of consecutive days; every column in a CTE needs a name.
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset
)
select UserID, NumConsecutiveDays, count(*) as NumOfRuns
from runsOfDays
group by UserID, NumConsecutiveDays
;

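Of course, if you want to filter this to consider only runs of a certain length, add "where NumConsecutiveDays >= @days" to the final query, before its GROUP BY.

Now, if you want to count a 16-day run as three 5-day runs, then each run counts as NumConsecutiveDays / @runlength of these runs (integer division rounds each one down). So instead of just counting how many runs there are of each length, use SUM. You could take the query above and use SUM(NumOfRuns * (NumConsecutiveDays / @runlength)), with the parentheses so that each run is rounded down before it is multiplied, but if you follow the logic, the query below is a little simpler.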
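
One detail is worth checking when summing over the grouped query: in T-SQL, `*` and `/` have the same precedence and are evaluated left to right, so it matters whether the division happens per run or after multiplying by the number of runs. The Python sketch below illustrates the difference; `periods_from_grouped` and the sample run lengths are invented for this example, and `//` stands in for integer division.

```python
from collections import Counter

def periods_from_grouped(run_lengths, run_length):
    """Return (multiply_first, divide_first) totals for one user's runs."""
    grouped = Counter(run_lengths)  # maps NumConsecutiveDays to NumOfRuns
    # Multiplying first pools the days of separate runs before rounding down.
    multiply_first = sum(n * days // run_length for days, n in grouped.items())
    # Dividing first rounds each run down on its own, which is the intended rule.
    divide_first = sum(n * (days // run_length) for days, n in grouped.items())
    return multiply_first, divide_first

# Hypothetical user with two separate 3-day runs, counted in 4-day periods.
multiply_first, divide_first = periods_from_grouped([3, 3], 4)
```

With these inputs, multiplying first gives 1 while dividing first gives 0. Neither 3-day run contains a 4-day period, so 0 is the answer that matches the rule in the question.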

declare @runlength int
set @runlength = 5; -- example value: count 5-day runs

with numberedrows as
(
        select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
        from tablename
)
,
runsOfDays as
(
-- One row per run of consecutive days; every column in a CTE needs a name.
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset
)
select UserID, sum(NumConsecutiveDays / @runlength) as NumOfRuns
from runsOfDays
where NumConsecutiveDays >= @runlength
group by UserID
;

Hope this helps,

Rob

Answer 2: (score: 0)

This works very well with my test data.

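DECLARE @days int
SET @days = 30

-- l is the first day of each contiguous range (no row for the previous day).
-- The subquery finds the last day of that range (no row for the following day).
-- Assumes CreationDate holds whole dates, because the joins compare for equality.
SELECT DISTINCT l.UserId,
    l.CreationDate AS RangeStart, -- keeps two equal-length ranges apart under DISTINCT
    (datediff(d, l.CreationDate,
(
    SELECT min(a.CreationDate) as CreationDate
    FROM UserHistory a
        LEFT OUTER JOIN UserHistory b
            ON a.CreationDate = dateadd(day, -1, b.CreationDate) AND
            a.UserId = b.UserId
    WHERE b.CreationDate IS NULL AND
        a.CreationDate >= l.CreationDate AND
        a.UserId = l.UserId
) )+1)/@days as cnt
INTO #cnttmp
FROM UserHistory l
    LEFT OUTER JOIN UserHistory r
        ON r.CreationDate = dateadd(day, -1, l.CreationDate) AND
        r.UserId = l.UserId
WHERE r.CreationDate IS NULL
ORDER BY l.UserId

-- Add up the periods of every range and keep users with at least one.
SELECT UserId, sum(cnt) AS cnt
FROM #cnttmp
GROUP BY UserId
HAVING sum(cnt) > 0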
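
For cross-checking results on a small data set, the same approach (find the first day of each range, walk to its last day, then divide the length) can be sketched in plain Python. This is an illustration rather than a translation of the query: `periods_per_user` and the sample rows are hypothetical, and every range is counted on its own, even when a user has two ranges of the same length.

```python
from datetime import date, timedelta

def periods_per_user(rows, days):
    """rows: iterable of (user_id, visit_date); returns {user_id: period count}."""
    one_day = timedelta(days=1)
    seen = set(rows)  # also drops duplicate (user, day) rows
    totals = {}
    for user_id, day in seen:
        if (user_id, day - one_day) in seen:
            continue  # not the first day of a range
        last = day
        while (user_id, last + one_day) in seen:
            last += one_day  # walk forward to the last day of the range
        length = (last - day).days + 1
        totals[user_id] = totals.get(user_id, 0) + length // days
    # Mirror the HAVING clause: keep only users with at least one period.
    return {user: count for user, count in totals.items() if count > 0}

# Hypothetical data: user 12 has two separate 2-day ranges, user 15 a single day.
rows = [(12, date(2009, 7, 1)), (12, date(2009, 7, 2)),
        (12, date(2009, 7, 5)), (12, date(2009, 7, 6)),
        (15, date(2009, 7, 1))]
result = periods_per_user(rows, 2)
```

For this sample, `result` contains user 12 with a count of 2, and user 15 is left out because a single day holds no 2-day period.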