Finding the longest sequence in SQL

Asked: 2015-08-27 13:53:00

Tags: sql-server tsql

Suppose I have a table containing dates, for example (in year-month-day format):

2015-06-22 12:39:11.257
2015-06-22 15:44:46.790
2015-06-22 15:48:50.583
2015-06-23 08:25:50.060
2015-07-01 07:11:37.037
2015-07-07 13:40:11.997
2015-07-08 13:12:08.723
2015-07-08 13:12:13.900
2015-07-08 13:12:16.010
2015-07-10 12:29:59.777
2015-07-13 15:42:49.077
2015-07-13 15:47:48.670
2015-07-13 15:47:51.547
2015-07-14 08:11:53.023
2015-07-14 08:14:21.243
2015-07-14 08:16:49.410
2015-07-14 08:17:11.997
2015-07-14 09:58:28.840
2015-07-14 09:59:34.640
2015-07-15 15:39:39.993
2015-07-17 08:45:20.157
2015-07-24 14:00:00.487
2015-07-24 14:03:53.773
2015-07-24 14:12:41.717
2015-07-24 14:13:33.957
2015-07-24 14:15:40.953
2015-08-25 12:43:03.920

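...is there a way (in SQL) to find the longest run of consecutive days? I only need the total number of days. In the data above there are entries for June 22 and June 23, so that sequence is 2 days long. There are also entries for July 13, July 14 and July 15; that is the longest sequence, at 3 days. I don't care about the time portion, so an entry just before midnight followed by an entry just after midnight counts as 2 days.

So I'd like some SQL that can look at the table and return the value 3 for the data above.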

2 Answers:

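Answer 0: (score: 7)

There is no need for a cursor or any kind of recursion to solve this. You can do it with the gaps-and-islands technique. The query below produces the desired output from your sample data.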

with SomeDates as
(
    select cast('2015-06-22 12:39:11.257' as datetime) as MyDate union all
    select '2015-06-22 15:44:46.790' union all
    select '2015-06-22 15:48:50.583' union all
    select '2015-06-23 08:25:50.060' union all
    select '2015-07-01 07:11:37.037' union all
    select '2015-07-07 13:40:11.997' union all
    select '2015-07-08 13:12:08.723' union all
    select '2015-07-08 13:12:13.900' union all
    select '2015-07-08 13:12:16.010' union all
    select '2015-07-10 12:29:59.777' union all
    select '2015-07-13 15:42:49.077' union all
    select '2015-07-13 15:47:48.670' union all
    select '2015-07-13 15:47:51.547' union all
    select '2015-07-14 08:11:53.023' union all
    select '2015-07-14 08:14:21.243' union all
    select '2015-07-14 08:16:49.410' union all
    select '2015-07-14 08:17:11.997' union all
    select '2015-07-14 09:58:28.840' union all
    select '2015-07-14 09:59:34.640' union all
    select '2015-07-15 15:39:39.993' union all
    select '2015-07-17 08:45:20.157' union all
    select '2015-07-24 14:00:00.487' union all
    select '2015-07-24 14:03:53.773' union all
    select '2015-07-24 14:12:41.717' union all
    select '2015-07-24 14:13:33.957' union all
    select '2015-07-24 14:15:40.953' union all
    select '2015-08-25 12:43:03.920'
)
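, GroupedDates as
(
    -- One row per calendar day. Subtracting the row number from the date gives
    -- the same DateGroup value for every day in a run of consecutive days.
    select cast(MyDate as DATE) as MyDate
        , DATEADD(day, - ROW_NUMBER() over (Order by cast(MyDate as DATE)), cast(MyDate as DATE)) as DateGroup
    from SomeDates
    group by cast(MyDate as DATE)
)
, SortedDates as
(
    -- One row per run: its length in days plus its first and last day.
    select DATEDIFF(day, min(MyDate), MAX(MyDate)) + 1 as GroupCount
        , min(MyDate) as StartDate
        , MAX(MyDate) as EndDate
    from GroupedDates
    group by DateGroup
)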

select top 1 GroupCount
    , StartDate
    , EndDate
from SortedDates
order by GroupCount desc

Answer 1: (score: 1)

The input here is effectively:

select trunc(date_column,'DD') day
from your_table
group by trunc(date_column,'DD');

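From this point on I'll treat the dates as numbers, which makes the sample data easier to enter; your problem then becomes finding the longest run of consecutive numbers.

So, the input table: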
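
As a rough illustration of treating dates as numbers, the sketch below maps timestamps to day numbers in Python. The helper name `day_numbers` and the four sample strings are assumptions made for illustration; the answer itself does this step in SQL by truncating to the day and grouping.

```python
from datetime import datetime


def day_numbers(timestamps):
    """Map timestamp strings to distinct, sorted day ordinals (time part dropped)."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    return sorted({datetime.strptime(t, fmt).date().toordinal() for t in timestamps})


# Hypothetical subset of the question's data.
stamps = [
    "2015-07-13 15:42:49.077",
    "2015-07-13 15:47:48.670",
    "2015-07-14 08:11:53.023",
    "2015-07-15 15:39:39.993",
]
nums = day_numbers(stamps)
print(nums)
```

Consecutive calendar days become consecutive integers, so the rest of the problem can be stated purely in terms of numbers.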

create table a(
col integer);

insert into a values (1);
insert into a values (2);
insert into a values (4);
insert into a values (5);
insert into a values (6);
insert into a values (8);
insert into a values (9);
insert into a values (11);
insert into a values (13);
insert into a values (14);
insert into a values (17);

With this query you get, for each row, the length of the longest sequence that starts at that row:

with s(col, i) as (
  select col, 1 i from a
  union all
  select a.col, i + 1
  from s join a on s.col = a.col+1
  )
  --select * from s
  select col, max(i) 
  from s 
  group by col
  order by col
  ;

Result:

col max
1   2
2   1
4   3
5   2
6   1
8   2
9   1
11  1
13  2
14  1
17  1

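From this point you can easily select the maximum. Also, for dates, you can use dateadd(dd,1,date_column) in the join condition instead of adding 1.

Explanation of the recursive CTE: every row starts with i = 1; each recursive step joins to the adjacent consecutive row (if one exists) and increments i. When there is no such row, the recursion stops.

OBS: I'm sure the code can be improved, but you get the idea.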


Update: to improve performance while still using recursion, we can start only from the numbers that have no consecutive predecessor.

with p as (
  select * from (
    select col, coalesce(col - (lag(col) over (order by col)),2) as has_prev 
    from a
    ) b
  where has_prev != 1
),
 s(col, i) as (
  select col, 1 i from p
  union all
  select s.col, i + 1
  from s join a on s.col+i = a.col
  )
  --select * from p
  select col, max(i) 
  from s 
  group by col
  order by col
  ;
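
The logic of the updated query can be sketched in plain Python as follows. This is an illustration under assumptions, not a translation of the SQL: the function name `run_lengths` is hypothetical, and a set lookup stands in for the `lag` check and the recursive join.

```python
def run_lengths(values):
    """Map each run's first value to the length of that consecutive run."""
    present = set(values)
    lengths = {}
    for start in sorted(present):
        # Skip values that have a consecutive predecessor: they are not run
        # heads (the role of the has_prev != 1 filter).
        if start - 1 in present:
            continue
        n = 1
        # Extend the run while start + n exists (the s.col + i = a.col join).
        while start + n in present:
            n += 1
        lengths[start] = n
    return lengths


cols = [1, 2, 4, 5, 6, 8, 9, 11, 13, 14, 17]
lengths = run_lengths(cols)
print(lengths)                # {1: 2, 4: 3, 8: 2, 11: 1, 13: 2, 17: 1}
print(max(lengths.values()))  # 3
```

Only the six run heads are expanded, instead of all eleven rows as in the first query, which is the saving the update is after.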
