Find the minimum date within ranges of dates using SQL

Date: 2011-11-10 17:38:44

Tags: sql sql-server-2008

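I have some history tables with begin and end dates. The problem is that several records in such a table refer to the same thing, but their begin and end dates are not exactly equal. I am trying to unify their begin and end dates.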

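Each cluster of records has begin and end dates that are close together (within about 7 seconds). Then there is another cluster with the same key (VoyageID in this case) but a different set of close begin and end dates. Does that make sense? If not, I can post some sample data.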

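Anyway, my goal right now is to find the minimum begin date for each cluster. What I have now gives me the minimum for each VoyageID. Any help would be appreciated. Thanks!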

This is what I have:

DECLARE @7S DATETIME
SET @7S = '0:0:07'

PRINT @7S

SELECT MAX(T1.BeginDate), T1.VoyageID FROM
hist.VoyageProfitLossValues T1 INNER JOIN
hist.VoyageProfitLossValues T2 ON
T1.VoyageID = T2.VoyageID AND
T1.BeginDate BETWEEN (T2.BeginDate - @7S) and (T2.BeginDate + @7S)
GROUP BY T1.VoyageID

Edit: sample data:

BeginDate                   EndDate                    VoyageID
2011-07-05 07:02:50.713     2011-07-05 07:25:53.007    6312
2011-07-05 07:02:50.870     2011-07-05 07:25:53.693    6312
2011-07-05 07:02:51.027     2011-07-05 07:25:54.387    6312
2011-07-08 14:22:21.147     NULL                       6312
2011-07-08 14:22:21.163     NULL                       6312
2011-07-08 14:22:21.177     NULL                       6312

Note: the actual data has more than 3 records per voyage, and the BeginDates can be further apart.

This is the result I want from it:

BeginDate                   VoyageID
2011-07-05 07:02:50.713     6312
2011-07-08 14:22:21.147     6312

What I have gives me only the first row.

I will eventually need the same for the end dates, but I can easily adapt the query for that.

2 answers:

Answer 0: (score: 2)

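The idea of this solution is to order the rows by BeginDate within each VoyageID. Starting from the top, pick the rows whose time difference to the previous row is more than 7 seconds.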

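@Voy stands in for hist.VoyageProfitLossValues. First I create a temp table #T whose ID column is filled with a sequence number ordered by BeginDate for each VoyageID. C is a recursive CTE that starts at ID = 1 and walks through all rows, comparing the current row with the previous one and storing the result in the column FirstDate. I added a second VoyageID to the sample data to show that it works for that as well.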

declare @Voy table
(
  BeginDate datetime,
  EndDate datetime,
  VoyageID int
)

insert into @Voy values  
('2011-07-05 07:02:50.713',     '2011-07-05 07:25:53.007',    6312),
('2011-07-05 07:02:50.870',     '2011-07-05 07:25:53.693',    6312),
('2011-07-05 07:02:51.027',     '2011-07-05 07:25:54.387',    6312),
('2011-07-08 14:22:21.147',      NULL                    ,    6312),
('2011-07-08 14:22:21.163',      NULL                    ,    6312),
('2011-07-08 14:22:21.177',      NULL                    ,    6312),
('2011-07-05 07:02:50.713',     '2011-07-05 07:25:53.007',    6313),
('2011-07-05 07:02:50.870',     '2011-07-05 07:25:53.693',    6313),
('2011-07-05 07:02:51.027',     '2011-07-05 07:25:54.387',    6313),
('2011-07-08 14:22:21.147',      NULL                    ,    6313),
('2011-07-08 14:22:21.163',      NULL                    ,    6313),
('2011-07-08 14:22:21.177',      NULL                    ,    6313)


create table #T
(
  ID int,
  VoyageID int,
  BeginDate datetime,
  primary key (ID, VoyageID)
)

insert into #T (ID, VoyageID, BeginDate)
select row_number() over(partition by VoyageID order by BeginDate),
       VoyageID,
       BeginDate
from @Voy     


;with C as
(
  select T.ID,
         T.VoyageID,
         T.BeginDate,
         1 as FirstDate
  from #T as T
  where T.ID = 1
  union all
  select T.ID,
         T.VoyageID,
         T.BeginDate,
         case when datediff(second, C.BeginDate, T.BeginDate) > 7 then 1 else 0 end
  from #T as T
    inner join C
      on T.ID = C.ID + 1 and
         T.VoyageID = C.VoyageID
)
select C.BeginDate,
       C.VoyageID
from C
where C.FirstDate = 1
order by C.VoyageID,
         C.BeginDate
option (maxrecursion 0)


drop table #T

Result:

BeginDate               VoyageID
----------------------- -----------
2011-07-05 07:02:50.713 6312
2011-07-08 14:22:21.147 6312
2011-07-05 07:02:50.713 6313
2011-07-08 14:22:21.147 6313
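
The same "compare each row with the previous one" idea can also be sketched without recursion, by numbering the rows and joining each row to its predecessor. This is only a minimal sketch under assumptions: the names Ordered, Cur and Prev are made up for illustration, @Voy again stands in for hist.VoyageProfitLossValues, and BeginDate is assumed to be unique within a VoyageID (with ties, the row numbering may not be stable across the two references to the CTE, which materializing the numbering into a temp table like #T avoids). It uses no LAG, so it should run on SQL Server 2008.

```sql
-- Sketch only: @Voy stands in for hist.VoyageProfitLossValues.
declare @Voy table
(
  BeginDate datetime,
  EndDate datetime,
  VoyageID int
);

insert into @Voy values
('2011-07-05 07:02:50.713', '2011-07-05 07:25:53.007', 6312),
('2011-07-05 07:02:50.870', '2011-07-05 07:25:53.693', 6312),
('2011-07-05 07:02:51.027', '2011-07-05 07:25:54.387', 6312),
('2011-07-08 14:22:21.147', NULL, 6312),
('2011-07-08 14:22:21.163', NULL, 6312),
('2011-07-08 14:22:21.177', NULL, 6312);

-- Number the rows per VoyageID in BeginDate order (Ordered is a hypothetical name).
with Ordered as
(
  select row_number() over(partition by VoyageID order by BeginDate) as ID,
         VoyageID,
         BeginDate
  from @Voy
)
-- A row starts a cluster when it has no previous row,
-- or when the previous row is more than 7 seconds earlier.
select Cur.BeginDate,
       Cur.VoyageID
from Ordered as Cur
  left outer join Ordered as Prev
    on Prev.VoyageID = Cur.VoyageID and
       Prev.ID = Cur.ID - 1
where Prev.ID is null or
      datediff(second, Prev.BeginDate, Cur.BeginDate) > 7
order by Cur.VoyageID,
         Cur.BeginDate;
```

Note that datediff(second, ...) counts the second boundaries crossed, not elapsed time, so gaps very close to the 7-second threshold can land on either side. If that matters, comparing on milliseconds is one option.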

Answer 1: (score: 0)

This approach uses a cursor. I don't know whether it suits your needs:

create table #datacluster ( 
    dateCluster datetime, 
    dateV datetime primary key)

DECLARE @7S DATETIME
DECLARE @base DATETIME
DECLARE @begindate DATETIME

SELECT @base = SYSDATETIME()
SET @7S = '0:0:07'

DECLARE cursor1 CURSOR 
FAST_FORWARD READ_ONLY FOR    
SELECT distinct T1.BeginDate 
FROM
  hist.VoyageProfitLossValues T1 
ORDER BY  T1.BeginDate DESC

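-- The cursor must be opened before the first fetch
OPEN cursor1

FETCH NEXT FROM cursor1 
INTO @begindate;    

WHILE @@FETCH_STATUS = 0
BEGIN

  -- Rows arrive newest first, so @base holds the latest BeginDate of the current cluster.
  -- Start a new cluster when this date is more than 7 seconds below @base.
  IF @base - @7S > @begindate
  BEGIN
    set @base = @begindate
  END
  insert into #datacluster ( dateCluster, dateV) 
  values (@base,  @begindate)

  FETCH NEXT FROM cursor1 
  INTO @begindate;    
END

CLOSE cursor1
DEALLOCATE cursor1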

Update the hist.VoyageProfitLossValues table from #datacluster:

UPDATE hist.VoyageProfitLossValues 
SET BeginDate = (
   SELECT C.dateCluster 
   FROM #datacluster C 
   WHERE 
      C.dateV = hist.VoyageProfitLossValues.BeginDate 
  )

Note 1: Untested!!

Optimizations:

Primary key on the temp table. Fast-forward, read-only cursor.
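
The mapping from each BeginDate to a representative date for its cluster can also be built without a cursor. The following is only a sketch, not a tested replacement: it borrows the recursive-CTE approach from the other answer and carries the first BeginDate of the current cluster along in an extra column. The names #Ordered and ClusterBegin are hypothetical, @Voy stands in for hist.VoyageProfitLossValues, and "more than 7 seconds after the previous row" is assumed to be the rule that starts a new cluster.

```sql
-- Sketch only: @Voy stands in for hist.VoyageProfitLossValues.
declare @Voy table
(
  BeginDate datetime,
  VoyageID int
);

insert into @Voy values
('2011-07-05 07:02:50.713', 6312),
('2011-07-05 07:02:50.870', 6312),
('2011-07-05 07:02:51.027', 6312),
('2011-07-08 14:22:21.147', 6312),
('2011-07-08 14:22:21.163', 6312),
('2011-07-08 14:22:21.177', 6312);

-- Materialize the per-voyage ordering so the recursion can step row by row.
-- #Ordered is a hypothetical name.
create table #Ordered
(
  ID int,
  VoyageID int,
  BeginDate datetime,
  primary key (ID, VoyageID)
);

insert into #Ordered (ID, VoyageID, BeginDate)
select row_number() over(partition by VoyageID order by BeginDate),
       VoyageID,
       BeginDate
from @Voy;

with C as
(
  -- The first row of each voyage starts the first cluster.
  select T.ID,
         T.VoyageID,
         T.BeginDate,
         T.BeginDate as ClusterBegin
  from #Ordered as T
  where T.ID = 1
  union all
  -- A gap of more than 7 seconds starts a new cluster;
  -- otherwise the row inherits the cluster start of the previous row.
  select T.ID,
         T.VoyageID,
         T.BeginDate,
         case when datediff(second, C.BeginDate, T.BeginDate) > 7
              then T.BeginDate
              else C.ClusterBegin
         end
  from #Ordered as T
    inner join C
      on T.ID = C.ID + 1 and
         T.VoyageID = C.VoyageID
)
select C.VoyageID,
       C.BeginDate,
       C.ClusterBegin
from C
order by C.VoyageID,
         C.BeginDate
option (maxrecursion 0);

drop table #Ordered;
```

Each row comes back with the earliest BeginDate of its cluster in ClusterBegin, which is the value an UPDATE keyed on VoyageID and BeginDate could then write back.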