I have a table that holds the availability status of workers. Its structure is as follows:
CREATE TABLE [dbo].[Availability]
(
[OID] BIGINT IDENTITY (1, 1) NOT NULL,
[LocumID] BIGINT NOT NULL,
[AvailableDate] SMALLDATETIME NOT NULL,
[AvailabilityStatusID] INT NOT NULL,
[LastModifiedAt] TIMESTAMP NOT NULL,
CONSTRAINT [PK_Availability] PRIMARY KEY CLUSTERED ([OID] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
Sample data looks like this:
OID LocumID AvailableDate AvailabilityStatusID LastModifiedAt
-------------------- -------------------- ----------------------- -------------------- ------------------
1 1 2009-03-02 00:00:00 1 0x0000000000201A8C
2 2 2009-03-04 00:00:00 1 0x0000000000201A8D
3 1 2009-03-05 00:00:00 1 0x0000000000201A8E
4 1 2009-03-06 00:00:00 1 0x0000000000201A8F
5 2 2009-03-07 00:00:00 1 0x0000000000201A90
6 7 2009-03-09 00:00:00 1 0x0000000000201A91
7 1 2009-03-11 00:00:00 1 0x0000000000201A92
8 1 2009-03-12 00:00:00 2 0x0000000000201A93
9 1 2009-03-14 00:00:00 1 0x0000000000201A94
10 1 2009-03-16 00:00:00 1 0x0000000000201A95
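The table now has more than 3 million records, and I have noticed inconsistencies in my data. For any given [LocumID], each [AvailableDate] must be unique, no matter how many dates that worker has. In other words, a worker can have exactly one [AvailabilityStatusID] (1, 2, 3, or 4) on any one date. However, the table contains errors where a worker has several rows for the same [AvailableDate], either with the same [AvailabilityStatusID] repeated or with different [AvailabilityStatusID] values.
How can I detect these records?
Regards.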
Answer 0 (score: 2)
WITH x AS
(
SELECT LocumID, dt = AvailableDate
FROM dbo.Availability
GROUP BY LocumID, AvailableDate
HAVING COUNT(*) > 1
)
SELECT a.OID, a.LocumID, a.AvailableDate,
a.AvailabilityStatusID, a.LastModifiedAt
FROM x
INNER JOIN dbo.Availability AS a
ON x.LocumID = a.LocumID
AND x.dt = a.AvailableDate
ORDER BY a.LocumID, a.AvailableDate;
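Since the question separates duplicates that repeat the same status from duplicates that carry conflicting statuses, a summary per duplicated pair may also help. This is only a sketch against the table as posted; the column aliases RowsForDate and DistinctStatuses are names made up for illustration.

```sql
-- One row per duplicated (LocumID, AvailableDate) pair.
-- RowsForDate and DistinctStatuses are illustrative aliases, not existing columns.
SELECT LocumID,
       AvailableDate,
       COUNT(*) AS RowsForDate,                               -- how many rows share this pair
       COUNT(DISTINCT AvailabilityStatusID) AS DistinctStatuses -- 1 = same status repeated, >1 = conflicting statuses
FROM dbo.Availability
GROUP BY LocumID, AvailableDate
HAVING COUNT(*) > 1
ORDER BY LocumID, AvailableDate;
```

Pairs where DistinctStatuses is 1 are plain repeats; pairs where it is greater than 1 need a decision about which status is the correct one.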
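Once you have cleaned up this data (I am not sure what your rules will be for which rows to keep), you should consider a unique constraint on (LocumID, AvailableDate). Here is how to create the constraint (though it cannot be created until after you have removed the duplicates):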
ALTER TABLE dbo.Availability
ADD CONSTRAINT uq_l_ad
UNIQUE (LocumID, AvailableDate);
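The constraint above will fail to create while duplicates remain. As a rough sketch of one possible cleanup, the statement below keeps a single row per (LocumID, AvailableDate) pair and deletes the rest. It assumes the row with the highest OID is the one to keep, which is purely an assumption for illustration; the question does not say which row should survive, so adjust the ORDER BY to match your own rule. The CTE name ranked and the column rn are likewise made up here.

```sql
BEGIN TRANSACTION;

-- Number the rows within each (LocumID, AvailableDate) pair.
-- ASSUMPTION: the highest OID (most recently inserted row) is the keeper.
WITH ranked AS
(
  SELECT OID,
         rn = ROW_NUMBER() OVER (PARTITION BY LocumID, AvailableDate
                                 ORDER BY OID DESC)
  FROM dbo.Availability
)
DELETE FROM ranked   -- deleting through the CTE removes the underlying table rows
WHERE rn > 1;        -- rn = 1 is the row that is kept

-- Inspect the result, then issue COMMIT TRANSACTION or ROLLBACK TRANSACTION.
```

On a table of this size it may be worth replacing the DELETE with a SELECT first to preview what would be removed.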
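Of course, once the constraint is in place your application will start receiving new errors (Msg 2627), since your current code clearly isn't checking whether a LocumID / AvailableDate combination already exists before adding a new one.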
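One way the application could avoid that error is to check for an existing row in the same statement that inserts. The following is a minimal sketch, not the asker's actual code: the variables @LocumID, @AvailableDate and @StatusID and their values are hypothetical, and the inline DECLARE initialisation assumes SQL Server 2008 or later.

```sql
-- Hypothetical input values for illustration only.
DECLARE @LocumID       BIGINT        = 1,
        @AvailableDate SMALLDATETIME = '20090302',
        @StatusID      INT           = 1;

-- Insert only when no row exists yet for this worker and date.
-- UPDLOCK + HOLDLOCK keep a concurrent session from inserting the same pair
-- between the existence check and the insert.
INSERT dbo.Availability (LocumID, AvailableDate, AvailabilityStatusID)
SELECT @LocumID, @AvailableDate, @StatusID
WHERE NOT EXISTS
(
  SELECT 1
  FROM dbo.Availability WITH (UPDLOCK, HOLDLOCK)
  WHERE LocumID = @LocumID
    AND AvailableDate = @AvailableDate
);

-- @@ROWCOUNT is 0 when the pair was already present and nothing was inserted.
IF @@ROWCOUNT = 0
  PRINT 'Availability already recorded for this worker and date.';
```

The unique constraint should still stay in place as the final safeguard; the check only makes the common case fail quietly instead of raising Msg 2627.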