使用SQL查找冲突的日期间隔

时间:2011-05-17 13:06:14

标签: sql sql-server tsql

假设我在Sql Server 2008中有以下表:

ItemId StartDate   EndDate
1      NULL        2011-01-15
2      2011-01-16  2011-01-25
3      2011-01-26  NULL

如您所见,此表包含StartDate和EndDate列。我想验证这些列中的数据。间隔不能相互冲突。因此,上表是有效的,但下一个表无效,因为第一行的结束日期大于第二行中的StartDate。

ItemId StartDate   EndDate
1      NULL        2011-01-17
2      2011-01-16  2011-01-25
3      2011-01-26  NULL

NULL在这里意味着无限。

你能帮我写一个数据验证脚本吗?

[第二项任务]

感谢您的回答。 我有一个并发症。我们假设,我有这样的表:

ItemId  IntervalId StartDate   EndDate
1       1          NULL        2011-01-15
2       1          2011-01-16  2011-01-25
3       1          2011-01-26  NULL
4       2          NULL        2011-01-17
5       2          2011-01-16  2011-01-25
6       2          2011-01-26  NULL

在这里,我想验证IntervalId组内的间隔,但不是整个表中的间隔。因此,间隔1将有效,但间隔2将无效。

还有。是否可以向表中添加约束以避免此类无效记录?

[最终解决方案]

我创建了一个函数来检查间隔是否存在冲突:

CREATE FUNCTION [dbo].[fnIntervalConflict]
(
    @intervalId INT,
    @originalItemId INT,
    @startDate DATETIME,
    @endDate DATETIME
)
RETURNS BIT
AS
BEGIN

    SET @startDate = ISNULL(@startDate,'1/1/1753 12:00:00 AM')
    SET @endDate = ISNULL(@endDate,'12/31/9999 11:59:59 PM')

    DECLARE @conflict BIT = 0

    SELECT TOP 1 @conflict = 1
    FROM Items
    WHERE IntervalId = @intervalId
    AND ItemId <> @originalItemId
    AND (
    (ISNULL(StartDate,'1/1/1753 12:00:00 AM') >= @startDate 
     AND ISNULL(StartDate,'1/1/1753 12:00:00 AM') <= @endDate)
     OR (ISNULL(EndDate,'12/31/9999 11:59:59 PM') >= @startDate 
     AND ISNULL(EndDate,'12/31/9999 11:59:59 PM') <= @endDate)
    )

    RETURN @conflict
END

然后我在表格中添加了2个约束:

ALTER TABLE dbo.Items ADD CONSTRAINT
    CK_Items_Dates CHECK (StartDate IS NULL OR EndDate IS NULL OR StartDate <= EndDate)

GO

ALTER TABLE dbo.Items ADD CONSTRAINT
    CK_Items_ValidInterval CHECK (([dbo].[fnIntervalConflict]([IntervalId], ItemId,[StartDate],[EndDate])=(0)))

GO

我知道,第二个约束会减慢插入和更新操作,但对我的应用程序来说并不重要。 而且,现在我可以在表格中插入和更新数据之前从我的应用程序代码中调用函数fnIntervalConflict

5 个答案:

答案 0 :(得分:4)

这样的事情会给你所有重叠的时期

SELECT
* 
FROM
mytable t1 
JOIN mytable t2 ON t1.EndDate>t2.StartDate AND t1.StartDate < t2.StartDate 

编辑Adrian的评论吼叫

答案 1 :(得分:3)

这将为您提供不正确的行。

添加了ROW_NUMBER()因为我不知道所有条目是否按顺序排列。

-- Testdata
declare @date datetime = '2011-01-17'

;with yourTable(itemID, startDate, endDate)
as
(
    SELECT  1,  NULL, @date
    UNION ALL
    SELECT  2,  dateadd(day, -1, @date),    DATEADD(day, 10, @date)
    UNION ALL
    SELECT  3,  DATEADD(day, 60, @date),    NULL
)

-- End testdata

,tmp
as
(
    select  *
            ,ROW_NUMBER() OVER(order by startDate) as rowno 
    from    yourTable
)

select  *
from    tmp t1
left join   tmp t2
    on t1.rowno = t2.rowno - 1
where   t1.endDate > t2.startDate

修改 至于更新的问题:

只需在PARTITION BY查询中添加ROW_NUMBER()子句并更改连接。

-- Testdata
declare @date datetime = '2011-01-17'

;with yourTable(itemID, startDate, endDate, intervalID)
as
(
    SELECT  1,  NULL, @date, 1
    UNION ALL
    SELECT  2,  dateadd(day, 1, @date), DATEADD(day, 10, @date),1
    UNION ALL
    SELECT  3,  DATEADD(day, 60, @date),    NULL,   1
    UNION ALL
    SELECT  4,  NULL, @date, 2
    UNION ALL
    SELECT  5,  dateadd(day, -1, @date),    DATEADD(day, 10, @date),2
    UNION ALL
    SELECT  6,  DATEADD(day, 60, @date),    NULL,   2
)

-- End testdata

,tmp
as
(
    select  *
            ,ROW_NUMBER() OVER(partition by intervalID order by startDate) as rowno 
    from    yourTable
)

select  *
from    tmp t1
left join   tmp t2
    on t1.rowno = t2.rowno - 1
    and t1.intervalID = t2.intervalID
where   t1.endDate > t2.startDate

答案 2 :(得分:2)

declare @T table (ItemId int, IntervalID int, StartDate datetime,   EndDate datetime)

insert into @T
select 1, 1,  NULL,        '2011-01-15' union all
select 2, 1, '2011-01-16', '2011-01-25' union all
select 3, 1, '2011-01-26',  NULL        union all
select 4, 2,  NULL,        '2011-01-17' union all
select 5, 2, '2011-01-16', '2011-01-25' union all
select 6, 2, '2011-01-26',  NULL

select T1.*
from @T as T1
  inner join @T as T2
    on coalesce(T1.StartDate, '1753-01-01') < coalesce(T2.EndDate, '9999-12-31') and
       coalesce(T1.EndDate, '9999-12-31') > coalesce(T2.StartDate, '1753-01-01') and
       T1.IntervalID = T2.IntervalID and
       T1.ItemId <> T2.ItemId

结果:

ItemId      IntervalID  StartDate               EndDate
----------- ----------- ----------------------- -----------------------
5           2           2011-01-16 00:00:00.000 2011-01-25 00:00:00.000
4           2           NULL                    2011-01-17 00:00:00.000

答案 3 :(得分:2)

与OP没有直接关系,但自从Adrian表达了兴趣。这是一个比SQL Server保持完整性的表,确保在任何时候只存在一个有效值。在这种情况下,我正在处理当前/历史表,但是可以修改示例以使用未来的数据(尽管在这种情况下,您不能拥有索引视图,并且您需要直接编写合并,而不是维持触发器。)

在这种特殊情况下,我正在处理一个我想跟踪其历史记录的链接表。首先,我们要链接的表格:

create table dbo.Clients (
    ClientID int IDENTITY(1,1) not null,
    Name varchar(50) not null,
    /* Other columns */
    constraint PK_Clients PRIMARY KEY (ClientID)
)
go
create table dbo.DataItems (
    DataItemID int IDENTITY(1,1) not null,
    Name varchar(50) not null,
    /* Other columns */
    constraint PK_DataItems PRIMARY KEY (DataItemID),
    constraint UQ_DataItem_Names UNIQUE (Name)
)
go

现在,如果我们正在构建一个普通表,我们将拥有以下内容(不要运行此):

create table dbo.ClientAnswers (
    ClientID int not null,
    DataItemID int not null,
    IntValue int not null,
    Comment varchar(max) null,
    constraint PK_ClientAnswers PRIMARY KEY (ClientID,DataItemID),
    constraint FK_ClientAnswers_Clients FOREIGN KEY (ClientID) references dbo.Clients (ClientID),
    constraint FK_ClientAnswers_DataItems FOREIGN KEY (DataItemID) references dbo.DataItems (DataItemID)
)

但是,我们想要一张可以代表完整历史的表格。特别是,我们希望设计结构,使得重叠的时间段永远不会出现在数据库中。我们总是知道哪个记录在任何特定时间都有效:

create table dbo.ClientAnswerHistories (
    ClientID int not null,
    DataItemID int not null,
    IntValue int null,
    Comment varchar(max) null,

    /* Temporal columns */
    Deleted bit not null,
    ValidFrom datetime2 null,
    ValidTo datetime2 null,
    constraint UQ_ClientAnswerHistories_ValidFrom UNIQUE (ClientID,DataItemID,ValidFrom),
    constraint UQ_ClientAnswerHistories_ValidTo UNIQUE (ClientID,DataItemID,ValidTo),
    constraint CK_ClientAnswerHistories_NoTimeTravel CHECK (ValidFrom < ValidTo),
    constraint FK_ClientAnswerHistories_Clients FOREIGN KEY (ClientID) references dbo.Clients (ClientID),
    constraint FK_ClientAnswerHistories_DataItems FOREIGN KEY (DataItemID) references dbo.DataItems (DataItemID),
    constraint FK_ClientAnswerHistories_Prev FOREIGN KEY (ClientID,DataItemID,ValidFrom)
        references dbo.ClientAnswerHistories (ClientID,DataItemID,ValidTo),
    constraint FK_ClientAnswerHistories_Next FOREIGN KEY (ClientID,DataItemID,ValidTo)
        references dbo.ClientAnswerHistories (ClientID,DataItemID,ValidFrom),
    constraint CK_ClientAnswerHistory_DeletionNull CHECK (
        Deleted = 0 or
        (
            IntValue is null and
            Comment is null
        )),
    constraint CK_ClientAnswerHistory_IntValueNotNull CHECK (Deleted=1 or IntValue is not null)
)
go

这是很多限制因素。维护此表的唯一方法是通过合并语句(请参阅下面的示例,并尝试推断自己的原因)。我们现在要构建一个模仿上面定义的ClientAnswers表的视图:

create view dbo.ClientAnswers
with schemabinding
as
    select
        ClientID,
        DataItemID,
        ISNULL(IntValue,0) as IntValue,
        Comment
    from
        dbo.ClientAnswerHistories
    where
        Deleted = 0 and
        ValidTo is null
go
create unique clustered index PK_ClientAnswers on dbo.ClientAnswers (ClientID,DataItemID)
go

我们有最初想要的PK约束。我们还使用ISNULL来恢复not null列的IntValue - 即使检查约束已经保证了这一点,SQL Server也无法获取此信息。如果我们正在使用ORM,我们会将其定位到ClientAnswers,并自动构建历史记录。接下来,我们可以拥有一个让我们回顾过去的功能:

create function dbo.ClientAnswers_At (
    @At datetime2
)
returns table
with schemabinding
as
    return (
        select
            ClientID,
            DataItemID,
            ISNULL(IntValue,0) as IntValue,
            Comment
        from
            dbo.ClientAnswerHistories
        where
            Deleted = 0 and
            (ValidFrom is null or ValidFrom <= @At) and
            (ValidTo is null or ValidTo > @At)
    )
go

最后,我们需要构建此历史记录的ClientAnswers上的触发器。我们需要使用合并语句,因为我们需要同时插入新行,并使用新的ValidTo值更新之前的“有效”行以结束日期。

create trigger T_ClientAnswers_I
on dbo.ClientAnswers
instead of insert
as
    set nocount on
    ;with Dup as (
        select i.ClientID,i.DataItemID,i.IntValue,i.Comment,CASE WHEN cah.ClientID is not null THEN 1 ELSE 0 END as PrevDeleted,t.Dupl
        from
            inserted i
                left join
            dbo.ClientAnswerHistories cah
                on
                    i.ClientID = cah.ClientID and
                    i.DataItemID = cah.DataItemID and
                    cah.ValidTo is null and
                    cah.Deleted = 1
                cross join
            (select 0 union all select 1) t(Dupl)
    )
    merge into dbo.ClientAnswerHistories cah
    using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0 and Dup.PrevDeleted = 1
    when matched then update set ValidTo = SYSDATETIME()
    when not matched and Dup.Dupl=1 then insert (ClientID,DataItemID,IntValue,Comment,Deleted,ValidFrom)
    values (Dup.ClientID,Dup.DataItemID,Dup.IntValue,Dup.Comment,0,CASE WHEN Dup.PrevDeleted=1 THEN SYSDATETIME() END);
go
create trigger T_ClientAnswers_U
on dbo.ClientAnswers
instead of update
as
    set nocount on
    ;with Dup as (
        select i.ClientID,i.DataItemID,i.IntValue,i.Comment,t.Dupl
        from
            inserted i
                cross join
            (select 0 union all select 1) t(Dupl)
    )
    merge into dbo.ClientAnswerHistories cah
    using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0
    when matched then update set ValidTo = SYSDATETIME()
    when not matched then insert (ClientID,DataItemID,IntValue,Comment,Deleted,ValidFrom)
    values (Dup.ClientID,Dup.DataItemID,Dup.IntValue,Dup.Comment,0,SYSDATETIME());
go
create trigger T_ClientAnswers_D
on dbo.ClientAnswers
instead of delete
as
    set nocount on
    ;with Dup as (
        select d.ClientID,d.DataItemID,t.Dupl
        from
            deleted d
                cross join
            (select 0 union all select 1) t(Dupl)
    )
    merge into dbo.ClientAnswerHistories cah
    using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0
    when matched then update set ValidTo = SYSDATETIME()
    when not matched then insert (ClientID,DataItemID,Deleted,ValidFrom)
    values (Dup.ClientID,Dup.DataItemID,1,SYSDATETIME());
go

显然,我可以构建一个更简单的表(不是连接表),但这是我的标准例子(虽然我花了一些时间来重构它 - 我忘记了set nocount on语句而)。但这里的优势在于,基表ClientAnswerHistories 无法存储相同ClientIDDataItemID值的重叠时间范围。

当你需要处理临时外键时,事情变得更加复杂。


当然,如果您不想要任何真正的差距,那么您可以删除Deleted列(以及相关的检查),真正使not nullnot null,修改insert触发执行普通插入,并使delete触发器引发错误。

答案 4 :(得分:0)

如果我的数据永远不会有重叠间隔,我总是对设计采取略微不同的方法......即不存储间隔,而只是存储开始时间。然后,有一个有助于显示间隔的视图。

CREATE TABLE intervalStarts
(
  ItemId      int,
  IntervalId  int,
  StartDate   datetime
)

CREATE VIEW intervals
AS
with cte as (
  select ItemId, IntervalId, StartDate,
     row_number() over(partition by IntervalId order by isnull(StartDate,'1753-01-01')) row
  from intervalStarts
)
select c1.ItemId, c1.IntervalId, c1.StartDate,
  dateadd(dd,-1,c2.StartDate) as 'EndDate'
from cte c1
  left join cte c2 on c1.IntervalId=c2.IntervalId
                    and c1.row=c2.row-1

因此,样本数据可能如下所示:

INSERT INTO intervalStarts
select 1, 1, null union
select 2, 1, '2011-01-16' union
select 3, 1, '2011-01-26' union
select 4, 2, null union
select 5, 2, '2011-01-26' union
select 6, 2, '2011-01-14'

和一个简单的SELECT * FROM intervals产生:

ItemId | IntervalId | StartDate  | EndDate
1      | 1          | null       | 2011-01-15
2      | 1          | 2011-01-16 | 2011-01-25
3      | 1          | 2011-01-26 | null
4      | 2          | null       | 2011-01-13
6      | 2          | 2011-01-14 | 2011-01-25
5      | 2          | 2011-01-26 | null