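Enforcing mutual uniqueness across multiple columns

Time: 2014-07-03 20:24:19

Tags: sql sql-server tsql

I'm trying to find an intuitive way of enforcing mutual uniqueness across two columns in a table. I am not looking for composite uniqueness, where duplicate combinations of keys are disallowed; rather, I want a rule where no key may appear again in either column. Take the following example: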

CREATE TABLE Rooms
(
    Id INT NOT NULL PRIMARY KEY,
)

CREATE TABLE Occupants
(
    PersonName VARCHAR(20),
    LivingRoomId INT NULL REFERENCES Rooms (Id),
    DiningRoomId INT NULL REFERENCES Rooms (Id),
)

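A person may pick any room as their living room, and any other room as their dining room. Once a room has been allocated to an occupant, it cannot be allocated again to another person (whether as a living room or as a dining room).

I'm aware that this issue could be resolved through data normalization; however, I cannot make breaking changes to the schema.

Update: In response to the proposed answers:

Two unique constraints (or two unique indexes) will not prevent duplicates across the two columns. Similarly, a simple LivingRoomId != DiningRoomId check constraint will not prevent duplicates across rows. For example, I want the following data to be forbidden: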

INSERT INTO Rooms VALUES (1), (2), (3), (4)
INSERT INTO Occupants VALUES ('Alex',    1, 2)
INSERT INTO Occupants VALUES ('Lincoln', 2, 3)

Room 2 is occupied simultaneously by Alex (as a living room) and by Lincoln (as a dining room); this should not be allowed.
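The rule can also be stated as a query: unpivot both columns into a single list of room IDs and look for any ID that appears more than once. This is a minimal sketch, assuming the Rooms and Occupants tables defined above; the RoomId and Uses aliases are introduced here only for illustration.

```sql
-- Lists every room that is referenced more than once across both columns.
-- A table that satisfies the desired rule returns no rows.
SELECT u.RoomId, COUNT(*) AS Uses
FROM Occupants AS o
    CROSS APPLY (VALUES (o.LivingRoomId), (o.DiningRoomId)) AS u (RoomId)
WHERE u.RoomId IS NOT NULL          -- unassigned rooms never conflict
GROUP BY u.RoomId
HAVING COUNT(*) > 1;
```

For the Alex and Lincoln rows, this reports room 2 with two uses. A query like this only detects violations after the fact; it does not prevent them.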

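Update 2: I've run some tests on the three main proposed solutions, timing how long each takes to insert 500,000 rows into the Occupants table, with each row having a pair of random unique room IDs.

Extending the Occupants table with unique indexes and a check constraint (that calls a scalar function) causes the insert to take around three times as long. The implementation of the scalar function is incomplete: it only checks that a new occupant's living room does not conflict with an existing occupant's dining room. I couldn't get the insert to complete in reasonable time when the reverse check was performed as well.

Adding a trigger that inserts each occupant's rooms as new rows into another table slows the insert by about 48%. Similarly, an indexed view takes 43% longer. In my opinion, using an indexed view is cleaner, since it avoids the need to create another table and lets SQL Server handle updates and deletes automatically.

The full script and results from the tests are below: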

SET STATISTICS TIME OFF
SET NOCOUNT ON

CREATE TABLE Rooms
(
    Id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    RoomName VARCHAR(10),
)

CREATE TABLE Occupants
(
    Id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    PersonName VARCHAR(10),
    LivingRoomId INT NOT NULL REFERENCES Rooms (Id),
    DiningRoomId INT NOT NULL REFERENCES Rooms (Id)
)

GO

DECLARE @Iterator INT = 0
WHILE (@Iterator < 10)
BEGIN
    INSERT INTO Rooms
    SELECT TOP (1000000) 'ABC'
    FROM sys.all_objects s1 WITH (NOLOCK)
        CROSS JOIN sys.all_objects s2 WITH (NOLOCK)
        CROSS JOIN sys.all_objects s3 WITH (NOLOCK);
    SET @Iterator = @Iterator + 1
END;

DECLARE @RoomsCount INT = (SELECT COUNT(*) FROM Rooms);

SELECT TOP 1000000 RoomId
INTO ##RandomRooms
FROM 
(
    SELECT DISTINCT
        CAST(RAND(CHECKSUM(NEWID())) * @RoomsCount AS INT) + 1 AS RoomId
    FROM sys.all_objects s1 WITH (NOLOCK)
        CROSS JOIN sys.all_objects s2 WITH (NOLOCK)

) s

ALTER TABLE ##RandomRooms
ADD Id INT IDENTITY(1,1)

SELECT
    'XYZ' AS PersonName,
    R1.RoomId AS LivingRoomId,
    R2.RoomId AS DiningRoomId
INTO ##RandomOccupants
FROM ##RandomRooms R1
    JOIN ##RandomRooms R2
        ON  R2.Id % 2 = 0
        AND R2.Id = R1.Id + 1

GO

PRINT CHAR(10) + 'Test 1: No integrity check'

CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON

INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants

SET STATISTICS TIME OFF
SET NOCOUNT ON

TRUNCATE TABLE Occupants

PRINT CHAR(10) + 'Test 2: Unique indexes and check constraint'

CREATE UNIQUE INDEX UQ_LivingRoomId
ON Occupants (LivingRoomId)

CREATE UNIQUE INDEX UQ_DiningRoomId
ON Occupants (DiningRoomId)

GO

CREATE FUNCTION CheckExclusiveRoom(@occupantId INT)
RETURNS BIT AS
BEGIN
RETURN 
(
    SELECT CASE WHEN EXISTS
    (
        SELECT *
        FROM Occupants O1
            JOIN Occupants O2
                ON O1.LivingRoomId = O2.DiningRoomId
             -- OR O1.DiningRoomId = O2.LivingRoomId
        WHERE O1.Id = @occupantId
    )
    THEN 0
    ELSE 1
    END
)
END

GO

ALTER TABLE Occupants
ADD CONSTRAINT ExclusiveRoom 
CHECK (dbo.CheckExclusiveRoom(Id) = 1)

CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON

INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants

SET STATISTICS TIME OFF
SET NOCOUNT ON

ALTER TABLE Occupants DROP CONSTRAINT ExclusiveRoom
DROP INDEX UQ_LivingRoomId ON Occupants
DROP INDEX UQ_DiningRoomId ON Occupants
DROP FUNCTION CheckExclusiveRoom

TRUNCATE TABLE Occupants

PRINT CHAR(10) + 'Test 3: Insert trigger'

CREATE TABLE RoomTaken 
(
    RoomId INT NOT NULL PRIMARY KEY REFERENCES Rooms (Id) 
)

GO

CREATE TRIGGER UpdateRoomTaken
ON Occupants
AFTER INSERT
AS 
    INSERT INTO RoomTaken
    SELECT RoomId
    FROM
    (
        SELECT LivingRoomId AS RoomId
        FROM INSERTED
            UNION ALL
        SELECT DiningRoomId AS RoomId
        FROM INSERTED
    ) s

GO  

CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON

INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants

SET STATISTICS TIME OFF
SET NOCOUNT ON

DROP TRIGGER UpdateRoomTaken
DROP TABLE RoomTaken

TRUNCATE TABLE Occupants

PRINT CHAR(10) + 'Test 4: Indexed view with unique index'

CREATE TABLE TwoRows
(
    Id INT NOT NULL PRIMARY KEY
)

INSERT INTO TwoRows VALUES (1), (2)

GO

CREATE VIEW OccupiedRooms
WITH SCHEMABINDING
AS
    SELECT RoomId = CASE R.Id WHEN 1 
                    THEN O.LivingRoomId 
                    ELSE O.DiningRoomId 
                    END
    FROM dbo.Occupants O
        CROSS JOIN dbo.TwoRows R

GO

CREATE UNIQUE CLUSTERED INDEX UQ_OccupiedRooms
ON OccupiedRooms (RoomId);

CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON

INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants

SET STATISTICS TIME OFF
SET NOCOUNT ON

DROP INDEX UQ_OccupiedRooms ON OccupiedRooms
DROP VIEW OccupiedRooms
DROP TABLE TwoRows

TRUNCATE TABLE Occupants

DROP TABLE ##RandomRooms
DROP TABLE ##RandomOccupants

DROP TABLE Occupants
DROP TABLE Rooms


/* Results:

Test 1: No integrity check

 SQL Server Execution Times:
   CPU time = 5210 ms,  elapsed time = 10853 ms.

(500000 row(s) affected)

Test 2: Unique indexes and check constraint

 SQL Server Execution Times:
   CPU time = 21996 ms,  elapsed time = 27019 ms.

(500000 row(s) affected)

Test 3: Insert trigger
SQL Server parse and compile time: 
   CPU time = 5663 ms, elapsed time = 11192 ms.

 SQL Server Execution Times:
   CPU time = 4914 ms,  elapsed time = 4913 ms.

(1000000 row(s) affected)

 SQL Server Execution Times:
   CPU time = 10577 ms,  elapsed time = 16105 ms.

(500000 row(s) affected)

Test 4: Indexed view with unique index

 SQL Server Execution Times:
   CPU time = 10171 ms,  elapsed time = 15777 ms.

(500000 row(s) affected)

*/

4 Answers:

Answer 0 (score: 10)

I think the only way to do this is to use a constraint and a function.

Pseudo code (haven't done this for a long time):

CREATE FUNCTION dbo.CheckExclusiveRoom()
RETURNS BIT
AS
BEGIN
    -- Returns 1 when any room is used more than once, 0 otherwise.
    -- Assumes Occupants has an Id key column (as in the question's test script).
    DECLARE @retval BIT = 0;
    IF EXISTS
    (
        SELECT 1
        FROM dbo.Occupants AS P
            JOIN dbo.Occupants AS S
                ON  P.LivingRoomId = S.DiningRoomId
                OR (P.Id <> S.Id
                    AND (   P.DiningRoomId = S.DiningRoomId
                         OR P.LivingRoomId = S.LivingRoomId))
    )
        SET @retval = 1;
    RETURN @retval;
END
GO

Then use this function in a check constraint.
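Wiring the function into the table could look like the sketch below. It assumes the function above has been created as dbo.CheckExclusiveRoom and returns 1 when a conflict exists; the constraint name CK_Occupants_ExclusiveRoom is made up for this example.

```sql
-- Reject any insert or update that leaves a room used more than once.
-- Assumes dbo.CheckExclusiveRoom() returns 1 on conflict, 0 otherwise.
ALTER TABLE dbo.Occupants
ADD CONSTRAINT CK_Occupants_ExclusiveRoom
CHECK (dbo.CheckExclusiveRoom() = 0);
```

A constraint like this runs the function for every modified row, and each call scans the whole table, which fits the slow timings the question reports for this approach.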

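An alternative would be to use an intermediate table, OccupiedRoom, into which you always insert the rooms that are in use (via a trigger, for instance), and to point the FK at it instead of at the Rooms table.

Reaction to comment:

Do you need to enforce it directly on the table, or is a constraint violation raised in reaction to an insert/update enough? Because then I am thinking of something like this:

  1. Create a simple table:

    create table RoomTaken (RoomID int primary key references Rooms (Id) )
    
  2. Create a trigger on insert/update/delete that makes sure any room used in Occupants is also kept in RoomTaken.

  3. If you try to duplicate room usage, the RoomTaken table will throw a PK violation.

  4. Not sure if this is enough and/or how its speed compares to the UDF (I assume it would be better).

    Yes, I see the problem that RoomTaken would have no FK to its usage in Occupants, but really, you are working under some constraints and there is no perfect solution; in my opinion it is speed (UDF) versus 100% integrity enforcement.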
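The trigger in step 2 could be sketched as follows. This is one possible shape, not a tested implementation: the trigger name TR_Occupants_SyncRoomTaken is hypothetical, and it assumes the RoomTaken table from step 1 lives in the dbo schema next to Occupants.

```sql
-- Keeps dbo.RoomTaken in step with dbo.Occupants for all three operations.
CREATE TRIGGER dbo.TR_Occupants_SyncRoomTaken
ON dbo.Occupants
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Release the rooms held by the old versions of the affected rows.
    DELETE rt
    FROM dbo.RoomTaken AS rt
    WHERE rt.RoomID IN (SELECT LivingRoomId FROM deleted
                        UNION ALL
                        SELECT DiningRoomId FROM deleted);

    -- Claim the rooms of the new versions. A room that is already taken,
    -- or that appears twice in this statement, violates the primary key
    -- on RoomTaken, so the triggering statement fails.
    INSERT INTO dbo.RoomTaken (RoomID)
    SELECT s.RoomId
    FROM (SELECT LivingRoomId AS RoomId FROM inserted
          UNION ALL
          SELECT DiningRoomId FROM inserted) AS s
    WHERE s.RoomId IS NOT NULL;
END;
```

Deleting before inserting lets a single UPDATE swap rooms between occupants without a false conflict.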

Answer 1 (score: 6)

You could create an "external" constraint in the form of an indexed view:

CREATE VIEW dbo.OccupiedRooms
WITH SCHEMABINDING
AS
SELECT r.Id
FROM   dbo.Occupants AS o
INNER JOIN dbo.Rooms AS r ON r.Id IN (o.LivingRoomId, o.DiningRoomId)
;
GO

CREATE UNIQUE CLUSTERED INDEX UQ_1 ON dbo.OccupiedRooms (Id);

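The view essentially unpivots the occupied rooms' IDs, putting them all in one column. The unique index on that column makes sure it has no duplicates.

Here's a demonstration of how this method works: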
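A minimal sketch of how the unique index on the view would treat the data from the question. It assumes the Rooms and Occupants tables defined there are empty and that the dbo.OccupiedRooms view and its UQ_1 index have been created as shown above.

```sql
INSERT INTO dbo.Rooms VALUES (1), (2), (3), (4);

-- Succeeds: the view gains rows for rooms 1 and 2.
INSERT INTO dbo.Occupants VALUES ('Alex', 1, 2);

-- Fails with a duplicate key error on UQ_1:
-- room 2 would appear in the view a second time.
INSERT INTO dbo.Occupants VALUES ('Lincoln', 2, 3);
```

No trigger or function is involved; SQL Server maintains the view's index as part of each statement against Occupants.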

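Update

As hvd has correctly remarked, the above solution does not catch attempts to insert identical LivingRoomId and DiningRoomId values when they are put on the same row. This is because the dbo.Rooms table is matched only once in that case, so the join produces just one row for the pair of references.

One way to fix that is suggested in the same comment: in addition to the indexed view, use a CHECK constraint on the dbo.Occupants table to prohibit rows with identical room IDs. However, the suggested LivingRoomId <> DiningRoomId condition will not work for cases where both columns are NULL. To account for that case, the condition could be expanded to this one:

LivingRoomId <> DiningRoomId AND (LivingRoomId IS NOT NULL OR DiningRoomId IS NOT NULL)

Alternatively, you could change the view's SELECT statement to catch all cases. If LivingRoomId and DiningRoomId were NOT NULL columns, you could avoid the join to dbo.Rooms and unpivot the columns using a cross join to a two-row table. The query below writes that table inline for brevity; because an indexed view cannot contain a derived table or UNION, an actual view needs a real two-row table, like TwoRows in the question's Test 4: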

SELECT  Id = CASE x.r WHEN 1 THEN o.LivingRoomId ELSE o.DiningRoomId END
FROM    dbo.Occupants AS o
CROSS
JOIN    (SELECT 1 UNION ALL SELECT 2) AS x (r)

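However, as those columns allow NULLs, this method would not allow you to insert more than one single-reference row. To make it work in your case, you would need to filter out NULL entries, but only if they come from rows where the other reference is not NULL. I believe adding the following WHERE clause to the above query would suffice:

WHERE o.LivingRoomId IS NULL AND o.DiningRoomId IS NULL
   OR x.r = 1 AND o.LivingRoomId IS NOT NULL
   OR x.r = 2 AND o.DiningRoomId IS NOT NULL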

Answer 2 (score: 1)

You could add a check constraint to the Occupants table:

CHECK (LivingRoomId <> DiningRoomId)

If you also want to handle NULLs:

CHECK ((LivingRoomId <> DiningRoomId) or LivingRoomId is NULL or DiningRoomId is NULL)
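To see what this constraint does and does not cover, here is a small sketch on a scratch table. #OccupantsDemo and the sample names are hypothetical, and the foreign keys are left out to keep the example self-contained.

```sql
-- Scratch copy of Occupants carrying only the row-level check.
CREATE TABLE #OccupantsDemo
(
    PersonName   VARCHAR(20),
    LivingRoomId INT NULL,
    DiningRoomId INT NULL,
    CHECK (LivingRoomId <> DiningRoomId
           OR LivingRoomId IS NULL
           OR DiningRoomId IS NULL)
);

INSERT INTO #OccupantsDemo VALUES ('Alex',    1, 2);     -- accepted
INSERT INTO #OccupantsDemo VALUES ('Lincoln', 2, NULL);  -- accepted: NULLs are allowed
INSERT INTO #OccupantsDemo VALUES ('Morgan',  5, 5);     -- rejected by the CHECK

DROP TABLE #OccupantsDemo;
```

The constraint only compares the two columns within a single row, so, as the question's update points out, Alex and Lincoln can still both hold room 2.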

Answer 3 (score: -2)

You can do this with 2 unique constraints. If you want to allow multiple NULLs, use filtered indexes, each with WHERE ... NOT NULL.
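The filtered indexes mentioned here might be written as below. The index names are made up for this sketch, and filtered indexes need SQL Server 2008 or later.

```sql
-- One unique index per column; rows with NULL in that column are skipped,
-- so any number of occupants may leave the room unassigned.
CREATE UNIQUE INDEX UQ_Occupants_LivingRoomId
ON dbo.Occupants (LivingRoomId)
WHERE LivingRoomId IS NOT NULL;

CREATE UNIQUE INDEX UQ_Occupants_DiningRoomId
ON dbo.Occupants (DiningRoomId)
WHERE DiningRoomId IS NOT NULL;
```

Each index guarantees uniqueness only within its own column. As the update in the question notes, a room can still appear once as a living room and once as a dining room.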