SQL Server: selecting frequent records within a time range

Time: 2020-09-28 15:24:13

Tags: sql sql-server

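I am looking for a way to select all records that share the same value in one column and whose datetimes fall within 5 minutes of each other. I manage a ticketing system and am trying to see all similar tickets created within a 5-minute span. See the sample data set below: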

|ID |Subject    |CreatedDateTime    |
-------------------------------------
|1  |A          |2020-09-28 11:01:00|
|2  |A          |2020-09-28 11:02:00|
|3  |A          |2020-09-28 11:03:00|
|4  |A          |2020-09-28 11:03:09|
|5  |A          |2020-09-28 11:04:52|
|6  |A          |2020-09-28 11:15:00|
|7  |B          |2020-09-28 11:20:00|
|8  |B          |2020-09-28 11:20:00|
|9  |B          |2020-09-28 11:20:00|

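My goal is to select only records 1-5, because there are 5 with the same Subject and they were all created within 5 minutes of each other. Records 6-10 should not be selected, because either the count for that Subject is not large enough or they fall outside the specified time range.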

Below is the test query I have so far. It does not account for the 5-minute range (I only look back 1 week, hence the WHERE clause):

SELECT Subject,COUNT(*)
FROM TableName
WHERE CreatedDateTime > DATEADD(day, -7, GETDATE())
GROUP BY Subject
HAVING COUNT(*) > 5 
ORDER BY COUNT(*) DESC;

Is there any way to see only similar records that fall within a short time span? Thanks in advance, everyone!

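2 answers:

Answer 0: (score: 1)

You can peek at the record before and after each record, find the time difference in minutes, and keep only the records that are within 5 minutes of at least one other record: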

create table t(id int, start_letter varchar(1), end_letter varchar(1));
create table search_data(words varchar(50))
insert into t values(1,'A','R')

begin
insert into search_data values('Air');
insert into search_data values('Amour');
insert into search_data values('Arogant');
end;

For an example that runs against the sample data, see the demo link below.

Demo
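
The snippet printed in this answer only creates and fills unrelated tables, so here is a minimal sketch of the approach it describes, not the answerer's own query. It assumes the question's TableName with columns ID, Subject and CreatedDateTime, and SQL Server 2012 or later for LAG and LEAD. The CTE name neighbours and the aliases PrevCreated and NextCreated are made up for illustration.

```sql
-- Sketch only: table and column names are assumed from the question's sample data.
WITH neighbours AS (
    SELECT ID, Subject, CreatedDateTime,
           -- creation time of the previous ticket with the same Subject (NULL for the first)
           LAG(CreatedDateTime)  OVER (PARTITION BY Subject ORDER BY CreatedDateTime, ID) AS PrevCreated,
           -- creation time of the next ticket with the same Subject (NULL for the last)
           LEAD(CreatedDateTime) OVER (PARTITION BY Subject ORDER BY CreatedDateTime, ID) AS NextCreated
    FROM TableName
)
SELECT ID, Subject, CreatedDateTime
FROM neighbours
-- keep a ticket if its nearest neighbour on either side is at most 300 seconds away;
-- a NULL neighbour makes the comparison unknown, so that side simply does not match
WHERE DATEDIFF(second, PrevCreated, CreatedDateTime) <= 300
   OR DATEDIFF(second, CreatedDateTime, NextCreated) <= 300
ORDER BY Subject, CreatedDateTime;
```

On the question's sample data this keeps tickets 1-5, but it also keeps 7-9, because each of those is within 5 minutes of another ticket with Subject B. The approach links neighbours and does not enforce a minimum count. The comparison uses DATEDIFF(second, ...) <= 300 rather than DATEDIFF(minute, ...) because DATEDIFF counts boundary crossings, so 11:00:59 to 11:01:00 already counts as one minute.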

Answer 1: (score: 0)

If you want to group records that are within 5 minutes of each other, you can use a recursive query to "group" the date ranges, as follows:

with data as (
    -- number each subject's rows by creation time
    select *, row_number() over (partition by subject order by createddatetime) as rnk
      from t
),
cte as (
    -- anchor: the earliest row of each subject starts group 1
    select id, subject, createddatetime as begin_date, createddatetime, cast(1 as int) as grp
      from data
     where rnk = 1
    union all
    -- recursive step: move on to the next id of the same subject
    select b.id
         , b.subject
         , b.createddatetime
           -- more than 5 minutes after the current group's start: this row starts a new group
         , case when datediff(minute, a.createddatetime, b.createddatetime) > 5
                then b.createddatetime
                else a.createddatetime
           end as createddatetime
         , case when datediff(minute, a.createddatetime, b.createddatetime) > 5
                then a.grp + 1
                else a.grp
           end as grp
      from cte a
      join t b
        on a.id + 1 = b.id
       and a.subject = b.subject
)
select * from cte order by 1

+----+---------+-------------------------+-------------------------+-----+
| id | subject |       begin_date        |     createddatetime     | grp |
+----+---------+-------------------------+-------------------------+-----+
|  1 | A       | 2020-09-28 11:01:00.000 | 2020-09-28 11:01:00.000 |   1 |
|  2 | A       | 2020-09-28 11:02:00.000 | 2020-09-28 11:01:00.000 |   1 |
|  3 | A       | 2020-09-28 11:03:00.000 | 2020-09-28 11:01:00.000 |   1 |
|  4 | A       | 2020-09-28 11:03:09.000 | 2020-09-28 11:01:00.000 |   1 |
|  5 | A       | 2020-09-28 11:04:52.000 | 2020-09-28 11:01:00.000 |   1 |
|  6 | A       | 2020-09-28 11:15:00.000 | 2020-09-28 11:15:00.000 |   2 |
|  6 | A       | 2020-09-28 11:17:00.000 | 2020-09-28 11:17:00.000 |   2 |
|  7 | B       | 2020-09-28 11:20:00.000 | 2020-09-28 11:20:00.000 |   1 |
|  8 | B       | 2020-09-28 11:20:00.000 | 2020-09-28 11:20:00.000 |   1 |
|  9 | B       | 2020-09-28 11:20:00.000 | 2020-09-28 11:20:00.000 |   1 |
+----+---------+-------------------------+-------------------------+-----+

db<>fiddle link

https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=7fabc2d2753c2ab01bb40f58d82bf222
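
To also apply the question's threshold of 5 tickets per Subject, one option is a self-join that counts how many tickets fall in the 5 minutes starting at each ticket. This is a minimal sketch and not part of either answer. It assumes the question's TableName with columns ID, Subject and CreatedDateTime; the CTE name window_starts and the alias WindowStart are made up for illustration.

```sql
-- Sketch only: table and column names are assumed from the question's sample data.
WITH window_starts AS (
    -- tickets that open a 5-minute window holding at least 5 tickets of the same Subject
    SELECT s.Subject, s.CreatedDateTime AS WindowStart
    FROM TableName AS s
    JOIN TableName AS m
      ON m.Subject = s.Subject
     AND m.CreatedDateTime >= s.CreatedDateTime
     AND m.CreatedDateTime <= DATEADD(minute, 5, s.CreatedDateTime)
    GROUP BY s.ID, s.Subject, s.CreatedDateTime
    HAVING COUNT(*) >= 5
)
-- return every ticket that falls inside at least one qualifying window;
-- DISTINCT removes repeats when several windows cover the same ticket
SELECT DISTINCT t.ID, t.Subject, t.CreatedDateTime
FROM TableName AS t
JOIN window_starts AS w
  ON w.Subject = t.Subject
 AND t.CreatedDateTime >= w.WindowStart
 AND t.CreatedDateTime <= DATEADD(minute, 5, w.WindowStart)
ORDER BY t.ID;
```

On the question's sample data this returns tickets 1-5: the window starting at 11:01:00 holds five Subject A tickets, and no other window reaches five. The 5-minute boundary is inclusive in this sketch, and the self-join may be slow on a large table unless Subject and CreatedDateTime are indexed.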
