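I have a group of tables (with a few one-to-many relations) that together form a single "unit". I need to make sure we weed out duplicates, but deciding whether something is a duplicate requires looking at all of the data.
To make matters worse, the database in question is still in Sql 2000 compatibility mode, so I can't use any newer features.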
Create Table UnitType
(
    Id int IDENTITY Primary Key,
    Action int not null,
    TriggerType varchar(25) not null
)
Create Table Unit
(
    Id int IDENTITY Primary Key,
    TypeId int Not Null,
    Message varchar(100),
    Constraint FK_Unit_Type Foreign Key (TypeId) References UnitType(Id)
)
Create Table Item
(
    Id int IDENTITY Primary Key,
    QuestionId int not null,
    Sequence int not null
)
Create Table UnitCondition
(
    Id int IDENTITY Primary Key,
    UnitId int not null,
    Value varchar(10),
    ItemId int not null,
    Constraint FK_UnitCondition_Unit Foreign Key (UnitId) References Unit(Id),
    Constraint FK_UnitCondition_Item Foreign Key (ItemId) References Item(Id)
)
Insert into Item (QuestionId, Sequence)
Values (1, 1),
(1, 2)
Insert into UnitType(Action, TriggerType)
Values (1, 'Changed')
Insert into Unit (TypeId, Message)
Values (1, 'Hello World'),
(1, 'Hello World')
Insert into UnitCondition(UnitId, Value, ItemId)
Values (1, 'Test', 1),
(1, 'Hello', 2),
(2, 'Test', 1),
(2, 'Hello', 2)
I've created a SqlFiddle demonstrating a simple form of this issue.
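A unit is considered a duplicate when all (non-Id) fields on the unit, and all conditions on that unit taken together, match another unit exactly in every detail. Think of it like Xml: a Unit node (containing the unit info and a conditions sub-collection) is unique if no other Unit node exists that is an exact string copy of it.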
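To make that definition concrete, here is a minimal sketch in Python rather than T-SQL. It is only an illustration of the rule, using the sample rows from the inserts above; the names `unit_signature` and `find_duplicates` are made up for this example, and it assumes no `Value` is NULL.

```python
from collections import defaultdict

# Sample rows mirroring the inserts above (ids follow IDENTITY order).
unit_types = {1: (1, "Changed")}                          # Id -> (Action, TriggerType)
units = {1: (1, "Hello World"), 2: (1, "Hello World")}    # Id -> (TypeId, Message)
items = {1: (1, 1), 2: (1, 2)}                            # Id -> (QuestionId, Sequence)
conditions = [                                            # (UnitId, Value, ItemId)
    (1, "Test", 1), (1, "Hello", 2),
    (2, "Test", 1), (2, "Hello", 2),
]

def unit_signature(unit_id):
    """Collect every non-Id field of a unit plus its conditions, in sorted order."""
    type_id, message = units[unit_id]
    action, trigger_type = unit_types[type_id]
    conds = sorted(
        (value, item_id) + items[item_id]
        for (uid, value, item_id) in conditions
        if uid == unit_id
    )
    return (action, trigger_type, type_id, message, tuple(conds))

def find_duplicates():
    """Group unit ids by signature and keep groups with more than one member."""
    groups = defaultdict(list)
    for unit_id in units:
        groups[unit_signature(unit_id)].append(unit_id)
    return [ids for ids in groups.values() if len(ids) > 1]

print(find_duplicates())  # [[1, 2]]
```

Two units are duplicates exactly when their signatures are equal, which is the same test as comparing the serialized Unit nodes as strings.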
Select
    Action,
    TriggerType,
    U.TypeId,
    U.Message,
    (
        Select C.Value, C.ItemId, I.QuestionId, I.Sequence
        From UnitCondition C
        Inner Join Item I on C.ItemId = I.Id
        Where C.UnitId = U.Id
        For XML RAW('Condition')
    ) as Conditions
from UnitType T
Inner Join Unit U on T.Id = U.TypeId
For XML RAW ('Unit'), ELEMENTS
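But the issue I have is that I can't seem to get the XML for each unit to appear as a separate record, and I'm not sure how to compare the Unit nodes to look for duplicates.
How can I run this query to determine whether there are duplicate Xml Unit nodes within the collection?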
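Answer 0 (score: 0)
If you want to determine whether a record is a duplicate, you don't need to combine all the values into one string. You can do this with the ROW_NUMBER function: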
SELECT
Action,
TriggerType,
U.Id,
U.TypeId,
U.Message,
C.Value,
I.QuestionId,
I.Sequence,
ROW_NUMBER () OVER (PARTITION BY <LIST OF FIELD THAT SHOULD BE UNIQUE>
ORDER BY <LIST OF FIELDS>) as DupeNumber
FROM UnitType T
Inner Join Unit U on T.Id = U.TypeId
Inner Join UnitCondition C on U.Id = C.UnitId
Inner Join Item I on C.ItemId = I.Id;
If DupeNumber is greater than 1, the record is a duplicate.
Answer 1 (score: 0)
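-- Pairs each unit with every later unit, then lines up their conditions by item.
-- A NULL on either side marks a condition with no counterpart in the other unit.
select u1.id, u2.id
from unit as u1
join unit as u2
  on u1.ID < u2.id
join UnitCondition uc1
  on uc1.unitID = u1.ID
full outer join UnitCondition uc2
  on uc2.unitID = u2.ID
  and uc2.itemID = uc1.itemID
where uc2.itemID is null or uc1.itemID is null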
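A related way to do the pairwise comparison is to count conditions: two units match if they have the same number of conditions and every condition of one has a partner in the other. The sketch below runs that idea against SQLite from Python as a stand-in for SQL Server, so treat it as an illustration, not a tested T-SQL solution. The function name `find_duplicate_pairs` and the third, non-matching unit are invented for the demo, and it assumes that `Message` and `Value` are never NULL and that a unit never repeats the same (ItemId, Value) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Unit (Id INTEGER PRIMARY KEY, TypeId INT NOT NULL, Message TEXT);
CREATE TABLE UnitCondition (
    Id INTEGER PRIMARY KEY, UnitId INT NOT NULL, Value TEXT, ItemId INT NOT NULL);
-- Units 1 and 2 are identical; unit 3 (hypothetical) has only one condition.
INSERT INTO Unit (TypeId, Message) VALUES
    (1, 'Hello World'), (1, 'Hello World'), (1, 'Hello World');
INSERT INTO UnitCondition (UnitId, Value, ItemId) VALUES
    (1, 'Test', 1), (1, 'Hello', 2),
    (2, 'Test', 1), (2, 'Hello', 2),
    (3, 'Test', 1);
""")

PAIR_QUERY = """
SELECT u1.Id AS FirstId, u2.Id AS SecondId
FROM Unit u1
JOIN Unit u2
  ON u1.Id < u2.Id
 AND u1.TypeId = u2.TypeId
 AND u1.Message = u2.Message
WHERE (SELECT COUNT(*) FROM UnitCondition c WHERE c.UnitId = u1.Id)
    = (SELECT COUNT(*) FROM UnitCondition c WHERE c.UnitId = u2.Id)
  AND (SELECT COUNT(*) FROM UnitCondition c WHERE c.UnitId = u1.Id)
    = (SELECT COUNT(*)
       FROM UnitCondition c1
       JOIN UnitCondition c2
         ON c2.ItemId = c1.ItemId
        AND c2.Value = c1.Value
       WHERE c1.UnitId = u1.Id
         AND c2.UnitId = u2.Id)
ORDER BY u1.Id, u2.Id
"""

def find_duplicate_pairs(connection):
    """Return (earlier_id, later_id) pairs whose fields and conditions all match."""
    return connection.execute(PAIR_QUERY).fetchall()

print(find_duplicate_pairs(conn))  # [(1, 2)]
```

The query uses only joins and correlated subqueries, which is why it may be worth trying when newer features are off the table, but it would need checking on the real server before relying on it.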
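Answer 2 (score: 0)
So, I managed to figure out what I needed to do. It's a little clunky, though.
First, you need to wrap the Xml Select statement in another select against the Unit table, so that the xml we end up with represents only that unit.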
Select
    Id,
    (
        Select
            Action,
            TriggerType,
            IU.TypeId,
            IU.Message,
            (
                Select C.Value, I.QuestionId, I.Sequence
                From UnitCondition C
                Inner Join Item I on C.ItemId = I.Id
                Where C.UnitId = IU.Id
                Order by C.Value, I.QuestionId, I.Sequence
                For XML RAW('Condition'), TYPE
            ) as Conditions
        from UnitType T
        Inner Join Unit IU on T.Id = IU.TypeId
        WHERE IU.Id = U.Id
        For XML RAW ('Unit')
    )
From Unit U
You can then wrap this in another select that groups the xml by its content.
Select content, count(*) as cnt
From
(
    Select
        Id,
        (
            Select
                Action,
                TriggerType,
                IU.TypeId,
                IU.Message,
                (
                    Select C.Value, C.ItemId, I.QuestionId, I.Sequence
                    From UnitCondition C
                    Inner Join Item I on C.ItemId = I.Id
                    Where C.UnitId = IU.Id
                    Order by C.Value, I.QuestionId, I.Sequence
                    For XML RAW('Condition'), TYPE
                ) as Conditions
            from UnitType T
            Inner Join Unit IU on T.Id = IU.TypeId
            WHERE IU.Id = U.Id
            For XML RAW ('Unit')
        ) as content
    From Unit U
) as data
group by content
having count(*) > 1
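This lets you group together entire units whose whole content is identical.
One thing to watch out for: to test for "uniqueness", you need to guarantee that the inner Xml selection(s) always produce the same output for the same data. To that end, you should apply an ordering to the relevant data (i.e. the data in the xml) to ensure consistency. Which order you apply doesn't really matter, as long as two identical collections are output in the same order.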
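A quick way to see why the ordering matters is to serialize the same set of conditions twice, once in storage order and once sorted. This is a small Python sketch using the standard library's ElementTree, not the output of FOR XML itself; the helper `unit_xml` and its simplified element layout are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

def unit_xml(message, conditions, ordered):
    """Serialize a unit as a Unit element with one Condition child per row."""
    unit = ET.Element("Unit", {"Message": message})
    rows = sorted(conditions) if ordered else conditions
    for value, item_id in rows:
        ET.SubElement(unit, "Condition", {"Value": value, "ItemId": str(item_id)})
    return ET.tostring(unit, encoding="unicode")

# The same two conditions, returned in a different order for each unit.
first = [("Test", 1), ("Hello", 2)]
second = [("Hello", 2), ("Test", 1)]

unsorted_equal = (unit_xml("Hello World", first, ordered=False)
                  == unit_xml("Hello World", second, ordered=False))
sorted_equal = (unit_xml("Hello World", first, ordered=True)
                == unit_xml("Hello World", second, ordered=True))

print(unsorted_equal, sorted_equal)  # False True
```

Without the sort, two units with identical conditions can produce different strings and slip past the group by; with it, the strings compare equal.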