用于将日期范围转换为每日记录的SQL查询

时间:2011-03-19 15:27:14

标签: tsql sql-server-2008-r2

要求

  1. 我有数据表,可以在日期范围内保存数据。
  2. 允许每条记录与先前的记录重叠(记录有CreatedOn datetime列。)
  3. 新记录可以定义它自己的日期范围,如果它需要可以重叠几个旧记录。
  4. 每个新的重叠记录都会覆盖它重叠的旧记录的设置。
  5. 结果集

    我需要获得的是获取使用记录重叠的任何日期范围的每日数据。它应该每天返回一个记录,并记录该特定日期的相应数据。

    要将范围转换为天,我想到 numbers/dates table 和用户定义的函数(UDF)来获取范围内每一天的数据,但我想知道是否还有其他(如更好 *甚至更快)这样做的方式,因为我使用的是最新的SQL Server 2008 R2。

    存储数据

    想象一下,我存储的数据看起来像这样

    ID | RangeFrom | RangeTo  | Starts | Ends  | CreatedOn (not providing data)
    ---|-----------|----------|--------|-------|-----------
    1  | 20110101  | 20110331 | 07:00  | 15:00
    2  | 20110401  | 20110531 | 08:00  | 16:00
    3  | 20110301  | 20110430 | 06:00  | 14:00 <- overrides both partially
    

    结果

    如果我想从2011年1月1日到2001年5月31日获取数据,结果表应如下所示(省略明显的行):

    DayDate | Starts | Ends
    --------|--------|------
    20110101| 07:00  | 15:00  <- defined by record ID = 1
    20110102| 07:00  | 15:00  <- defined by record ID = 1
    ...                          many rows omitted for obvious reasons
    20110301| 06:00  | 14:00  <- defined by record ID = 3
    20110302| 06:00  | 14:00  <- defined by record ID = 3
    ...                          many rows omitted for obvious reasons
    20110501| 08:00  | 16:00  <- defined by record ID = 2
    20110502| 08:00  | 16:00  <- defined by record ID = 2
    ...                          many rows omitted for obvious reasons
    20110531| 08:00  | 16:00  <- defined by record ID = 2
    

2 个答案:

答案 0 :(得分:6)

实际上,由于您正在使用日期,因此日历表会更有帮助。

Declare @StartDate date
Declare @EndDate date

;With Calendar As
    (
    Select @StartDate As [Date]
    Union All
    Select DateAdd(d,1,[Date])
    From Calendar
    Where [Date] < @EndDate
    )
Select ...
From Calendar
    Left Join MyTable
        On Calendar.[Date] Between MyTable.Start And MyTable.End
Option ( Maxrecursion 0 );

<强>加成

错过了原帖中关于特朗普规则的部分:

Set DateFormat MDY;
Declare @StartDate date = '20110101';
Declare @EndDate date = '20110501';

-- This first CTE is obviously to represent
-- the source table
With SampleData As 
    (
    Select 1 As Id
        , Cast('20110101' As date) As RangeFrom
        , Cast('20110331' As date) As RangeTo
        , Cast('07:00' As time) As Starts
        , Cast('15:00' As time) As Ends
        , CURRENT_TIMESTAMP As CreatedOn
    Union All Select 2, '20110401', '20110531', '08:00', '16:00', DateAdd(s,1,CURRENT_TIMESTAMP )
    Union All Select 3, '20110301', '20110430', '06:00', '14:00', DateAdd(s,2,CURRENT_TIMESTAMP )
    )
    , Calendar As
    (
    Select @StartDate As [Date]
    Union All
    Select DateAdd(d,1,[Date])
    From Calendar
    Where [Date] < @EndDate
    )
    , RankedData As
    (
    Select C.[Date]
        , S.Id
        , S.RangeFrom, S.RangeTo, S.Starts, S.Ends
        , Row_Number() Over( Partition By C.[Date] Order By S.CreatedOn Desc ) As Num
    From Calendar As C
        Join SampleData As S
            On C.[Date] Between S.RangeFrom And S.RangeTo
    )
Select [Date], Id, RangeFrom, RangeTo, Starts, Ends
From RankedData
Where Num = 1   
Option ( Maxrecursion 0 );

简而言之,我对所有样本数据进行排名,他们更喜欢与同一日期重叠的较新行。

答案 1 :(得分:1)

为什么在内存中做得更好的时候才能在数据库中完成

这是解决方案(我最终使用的)在数据传输,速度和资源方面似乎最合理。

  1. 从DB到中间层(较小的数据量)获取实际范围定义
  2. 在内存日历中生成某个日期范围(比在DB中更快)
  3. 将这些数据库定义放入(比DB更容易,更快)
  4. 就是这样。我意识到,如果你在内存代码中有可执行文件可以更快更有效地执行相同的操作,那么在DB中复杂某些东西并不值得。