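I want to write a T-SQL function that tests whether a table contains duplicate sets of rows, where some columns are compared and others are ignored.
For example, consider the following data set: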
BomID  PartNumber  ItemNumber  Quantity  UnitID
4164   10004001    10001419    1         33
4169   10004001    103599      1         33
4171   10004001    103601      1         33
4163   10004001    10001329    10        33
4166   10004001    101823      8         33
10794  10012161    10001419    1         33
10799  10012161    103599      1         33
10801  10012161    103601      1         33
10793  10012161    10001329    10        33
10796  10012161    101823      8         33
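For anyone who wants to reproduce the data set, a minimal setup script might look like the sketch below. The table name `dbo.bom` and the all-`int` column types are assumptions made for illustration only; the real schema may differ (for instance, the part number could be a string-based user-defined type).

```sql
-- Hypothetical table definition: name and column types are assumptions.
CREATE TABLE dbo.bom (
    BomID      int NOT NULL PRIMARY KEY,
    PartNumber int NOT NULL,
    ItemNumber int NOT NULL,
    Quantity   int NOT NULL,
    UnitID     int NOT NULL
);

-- The ten sample rows listed above.
INSERT INTO dbo.bom (BomID, PartNumber, ItemNumber, Quantity, UnitID)
VALUES
    (4164,  10004001, 10001419,  1, 33),
    (4169,  10004001, 103599,    1, 33),
    (4171,  10004001, 103601,    1, 33),
    (4163,  10004001, 10001329, 10, 33),
    (4166,  10004001, 101823,    8, 33),
    (10794, 10012161, 10001419,  1, 33),
    (10799, 10012161, 103599,    1, 33),
    (10801, 10012161, 103601,    1, 33),
    (10793, 10012161, 10001329, 10, 33),
    (10796, 10012161, 101823,    8, 33);
```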
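I want to write a function, Bom.f_GetPartsThatHaveAnIdenticalBom(partNumber), that, when passed 10004001, efficiently detects that 10012161 has duplicate records by comparing the tuple {ItemNumber, Quantity, UnitID}. The key field BomID would be ignored. The function would therefore return a distinct list of the part numbers (if any) that have an identical BOM.
I have done this by hand using various techniques. But since I seem to need this routine more and more often, I would like a set-based, efficient function that can be composed with other tables in LINQ to Entities queries.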
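Answer 0: (score: 2)
The following query uses a full outer join to compare the two sets. Any record that fails to match produces NULL values on one side or the other. The comparison in the having clause filters these out.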
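SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber
FROM bom b1 FULL OUTER JOIN
     bom b2
     -- Pair each row with rows of other parts that share (ItemNumber, Quantity, UnitID)
     ON b1.ItemNumber = b2.ItemNumber AND
        b1.Quantity = b2.Quantity AND
        b1.UnitID = b2.UnitID AND
        b1.PartNumber <> b2.PartNumber
-- Restrict the source side to the requested part (this also drops rows where b1 is NULL)
WHERE b1.PartNumber = @PartNumber
GROUP BY b1.PartNumber, b2.PartNumber
-- Keep only groups in which neither side contributed a NULL PartNumber
HAVING COUNT(*) = COUNT(b1.PartNumber) AND
       COUNT(*) = COUNT(b2.PartNumber);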
You can make this more efficient with an index on (itemnumber, quantity, unitid, partnumber).
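As a rough sketch, the suggested index could be created as follows. The index name is hypothetical, and the table is assumed to be `dbo.bom`; the column order simply follows the suggestion above.

```sql
-- Hypothetical index name; the schema-qualified table name is an assumption.
-- The three compared columns lead, with PartNumber last so the join
-- predicate can be evaluated from the index alone.
CREATE NONCLUSTERED INDEX IX_bom_ItemNumber_Quantity_UnitID_PartNumber
    ON dbo.bom (ItemNumber, Quantity, UnitID, PartNumber);
```

Whether the optimizer actually picks this index depends on the data distribution, so it is worth checking the execution plan before and after.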
Answer 1: (score: 1)
Here is a SQL statement that may work for you.
DECLARE @PartNumber int = 10004001;

SELECT DISTINCT bom2.TargetPartNumber
FROM
(
    -- Number of BOM rows for the source part
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM bom
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
) AS bom1
JOIN
(
    -- For every other part, the number of rows that match a source row
    -- on (ItemNumber, Quantity, UnitID)
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM bom b1
    JOIN bom b2 ON b1.ItemNumber = b2.ItemNumber
        AND b1.Quantity = b2.Quantity
        AND b1.UnitID = b2.UnitID
        AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
) AS bom2 ON bom1.PartNumber = bom2.PartNumber
    -- Every source row must have a match in the target
    AND bom1.ItemCount = bom2.ItemCount
-- The target must not contain any rows beyond the matched ones
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM bom WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber;
You can put this into a stored procedure or a function. @PartNumber represents the value you pass to the function.
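One way to package the statement, sketched here under some assumptions, is an inline table-valued function, since its result can be joined like a table. The function name comes from the question. The `int` parameter type, the `dbo.bom` table name, and the existence of a `Bom` schema are assumptions. The ORDER BY is left out because T-SQL does not allow it in an inline function without TOP, OFFSET, or FOR XML.

```sql
-- Sketch only: parameter type, table name, and the Bom schema are assumed.
CREATE FUNCTION Bom.f_GetPartsThatHaveAnIdenticalBom (@PartNumber int)
RETURNS TABLE
AS
RETURN
(
    SELECT DISTINCT bom2.TargetPartNumber
    FROM
    (
        -- Number of BOM rows for the source part
        SELECT PartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom
        WHERE PartNumber = @PartNumber
        GROUP BY PartNumber
    ) AS bom1
    JOIN
    (
        -- Matching row count per candidate target part
        SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom b1
        JOIN dbo.bom b2 ON b1.ItemNumber = b2.ItemNumber
            AND b1.Quantity = b2.Quantity
            AND b1.UnitID = b2.UnitID
            AND b1.PartNumber <> b2.PartNumber
        WHERE b1.PartNumber = @PartNumber
        GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
        AND bom1.ItemCount = bom2.ItemCount
    -- Reject targets that carry extra rows
    WHERE bom1.ItemCount = (SELECT COUNT(*) FROM dbo.bom WHERE PartNumber = bom2.TargetPartNumber)
);
```

It could then be queried like a table, with any ordering applied by the caller:

```sql
SELECT TargetPartNumber
FROM Bom.f_GetPartsThatHaveAnIdenticalBom(10004001)
ORDER BY TargetPartNumber;
```

With the sample rows from the question, this should return 10012161.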
Answer 2: (score: 1)
Here is the complete solution, a modified version based on bobs's answer:
DECLARE @PartNumber AS udt_PartNumber; SET @PartNumber = N'10012163';

SELECT DISTINCT bom2.TargetPartNumber
FROM
(
    -- Number of BOM rows for the source part
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
) AS bom1
JOIN
(
    -- Matching row count per candidate target part.
    -- The WHERE filter on b1 discards rows where b1 is NULL, so this
    -- RIGHT JOIN returns the same rows an inner join would.
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials b1
    RIGHT JOIN Part.BillsOfMaterials b2 ON b1.ItemNumber = b2.ItemNumber
        AND b1.Quantity = b2.Quantity
        AND b1.UnitID = b2.UnitID
        AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
) AS bom2 ON bom1.PartNumber = bom2.PartNumber
    AND bom1.ItemCount = bom2.ItemCount
-- Reject targets that carry rows the source BOM does not have
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM Part.BillsOfMaterials WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber;
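The only difference is the final WHERE clause, which ensures that no match is found if the target contains extra rows that do not exist in the source part number's BOM.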