Function to find identical sets of rows when comparing specified columns

Time: 2013-01-11 00:25:06

Tags: sql tsql

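I want to write a T-SQL function that tests whether a table contains duplicate sets of rows, where some columns are compared and others are ignored.

For example, consider the following data set: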

BomID   PartNumber  ItemNumber  Quantity    UnitID
4164    10004001    10001419        1         33
4169    10004001    103599          1         33
4171    10004001    103601          1         33
4163    10004001    10001329       10         33
4166    10004001    101823          8         33
10794   10012161    10001419        1         33
10799   10012161    103599          1         33
10801   10012161    103601          1         33
10793   10012161    10001329       10         33
10796   10012161    101823          8         33

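I would like to write a function, Bom.f_GetPartsThatHaveAnIdenticalBom(partNumber), that, when passed 10004001, efficiently detects that 10012161 has a duplicate set of rows by comparing the tuple (ItemNumber, Quantity, UnitID). The key column BomID is ignored. The function would therefore return a distinct list of the part numbers (if any) that have an identical BOM.

I have already done this by hand using various techniques. But since I seem to need this routine more and more often, I would like a set-based, efficient function that can also be composed with other tables in LINQ to Entities queries.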

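3 Answers:

Answer 0 (score: 2)

The following query uses a full outer join to compare the two sets. Any row without a match produces NULL values on one side or the other. The comparison in the having clause filters those out.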

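SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber
FROM bom b1 full outer join   -- alias must be b1 to match the references below
     bom b2
     ON b1.ItemNumber = b2.ItemNumber AND
        b1.Quantity = b2.Quantity and
        b1.UnitID = b2.UnitID and
        b1.PartNumber <> b2.PartNumber
-- Caveat: this filter drops rows where b1 is NULL, so target rows that have
-- no counterpart in the source part are not seen by the having clause.
WHERE b1.PartNumber = @PartNumber
GROUP BY b1.PartNumber, b2.PartNumber
having count(*) = count(b1.PartNumber) and
       count(*) = count(b2.PartNumber)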

You can make this more efficient with an index on (itemnumber, quantity, unitid, partnumber).

Answer 1 (score: 1)

Here is a SQL statement that may work for you.
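
As a minimal sketch of the suggested index: the index name below is hypothetical, and the dbo schema is an assumption, since the query above refers to the table only as bom.

```sql
-- Hypothetical index name; the dbo schema is an assumption.
-- Column order follows the suggestion above: the three compared columns
-- first, then PartNumber. The self-join touches only these four columns,
-- so the index covers it.
CREATE NONCLUSTERED INDEX IX_bom_Item_Qty_Unit_Part
    ON dbo.bom (ItemNumber, Quantity, UnitID, PartNumber);
```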

DECLARE @PartNumber int = 10004001

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM bom
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM bom b1
    JOIN bom b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                AND bom1.ItemCount = bom2.ItemCount
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM bom WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber

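You can put this into a stored procedure or a function. @PartNumber represents the value you pass to the function.

Answer 2 (score: 1)

Here is the complete solution, a modified version based on bobs' answer: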
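
One way this might be wrapped is as an inline table-valued function, using the function name from the question. This is a sketch under assumptions, not a tested implementation: it assumes the Bom schema already exists, that the table is dbo.bom, and that PartNumber is an int as in the DECLARE above.

```sql
-- Sketch only: assumes the Bom schema exists, the table is dbo.bom,
-- and PartNumber is an int.
CREATE FUNCTION Bom.f_GetPartsThatHaveAnIdenticalBom (@PartNumber int)
RETURNS TABLE
AS
RETURN
(
    SELECT DISTINCT bom2.TargetPartNumber
    FROM
        (
        -- Number of BOM rows for the source part.
        SELECT PartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom
        WHERE PartNumber = @PartNumber
        GROUP BY PartNumber
        ) AS bom1
    JOIN
        (
        -- Number of source rows matched by each other part.
        SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom b1
        JOIN dbo.bom b2 ON b1.ItemNumber = b2.ItemNumber
                       AND b1.Quantity = b2.Quantity
                       AND b1.UnitID = b2.UnitID
                       AND b1.PartNumber <> b2.PartNumber
        WHERE b1.PartNumber = @PartNumber
        GROUP BY b1.PartNumber, b2.PartNumber
        ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                 AND bom1.ItemCount = bom2.ItemCount
    -- The target must not have any rows beyond the matched ones.
    WHERE bom1.ItemCount = (SELECT COUNT(*) FROM dbo.bom WHERE PartNumber = bom2.TargetPartNumber)
    -- ORDER BY is left out: it is not allowed in an inline function
    -- unless TOP, OFFSET or FOR XML is also specified.
);
GO

-- Usage: sort in the calling query instead.
SELECT TargetPartNumber
FROM Bom.f_GetPartsThatHaveAnIdenticalBom(10004001)
ORDER BY TargetPartNumber;
```

With only the sample rows from the question loaded, the call should return 10012161. Because the function is inline, it can be joined or used with CROSS APPLY like any other table source. The count comparison relies on a part never listing the same (ItemNumber, Quantity, UnitID) combination twice.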

DECLARE @PartNumber AS udt_PartNumber; SET @PartNumber = N'10012163';

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials b1
    RIGHT JOIN Part.BillsOfMaterials b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                AND bom1.ItemCount = bom2.ItemCount
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM Part.BillsOfMaterials WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber

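The only difference is the final WHERE clause, which ensures that no match is found if the target contains extra rows that do not exist in the BOM of the source part number.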
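
To see the effect of that final WHERE clause, the following self-contained sketch runs the same query shape against a table variable. The table variable @bom and all of its values are made up for illustration: part 200 has exactly the same rows as part 100, while part 300 has the same rows plus one extra.

```sql
-- Illustrative data only; @bom and its values are hypothetical.
DECLARE @bom TABLE (BomID int, PartNumber int, ItemNumber int, Quantity int, UnitID int);

INSERT INTO @bom (BomID, PartNumber, ItemNumber, Quantity, UnitID) VALUES
    (1, 100, 1, 1, 33),
    (2, 100, 2, 5, 33),
    (3, 200, 1, 1, 33),   -- part 200: identical to part 100
    (4, 200, 2, 5, 33),
    (5, 300, 1, 1, 33),   -- part 300: same two rows ...
    (6, 300, 2, 5, 33),
    (7, 300, 3, 2, 33);   -- ... plus one extra row

DECLARE @PartNumber int = 100;

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM @bom
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM @bom b1
    JOIN @bom b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
             AND bom1.ItemCount = bom2.ItemCount
-- Part 300 matches both source rows but has 3 rows in total, so it is rejected here.
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM @bom WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber;
```

With this data the query returns only 200. If the final WHERE clause is removed, 300 is returned as well, because both rows of part 100 still find a match in part 300.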