Self-joining a table and matching multiple rows

Time: 2016-12-02 16:31:11

Tags: c# sql sql-server ms-access subquery

Consider the following many-to-many table:

FK_Composition | FK_Part | Position | Quantity
---------------+---------+----------+---------
101            | 2001    | -3       | 1
101            | 2002    | -2       | 2
101            | 2003    | -1       | 1
101            | 2011    |  0       | 1
102            | 2001    | -2       | 1
102            | 2002    | -1       | 2
102            | 2003    |  0       | 1
102            | 2012    |  1       | 1

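The goal is to find compositions that are nearly identical to a given composition. "Nearly identical" means:

  1. Except for the part at the highest position, the compared composition consists of the same parts as the original
  2. All parts must be at the same relative positions (for example, positions [-3,-2,-1,0] are allowed when comparing with a composition at positions [-1,0,1,2])
  3. Each occurrence of a part must have the same quantity
  4. The compared composition has no more parts than the original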

By these rules, composition 102 is nearly identical to 101.
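
The rules above can be sketched as an in-memory comparison. This is a minimal Python sketch, not the SQL or LINQ the question is after; `nearly_same` and `find_matches` are hypothetical names. It assumes that a candidate must have exactly as many parts as the original (which is what counterexample 152 further down suggests), that positions are unique within a composition, and that the part at the highest position may differ as long as it sits at the same relative offset.

```python
from collections import defaultdict

# (FK_Composition, FK_Part, Position, Quantity), taken from the tables in the question
ROWS = [
    (101, 2001, -3, 1), (101, 2002, -2, 2), (101, 2003, -1, 1), (101, 2011, 0, 1),
    (102, 2001, -2, 1), (102, 2002, -1, 2), (102, 2003, 0, 1), (102, 2012, 1, 1),
    (151, 2001, -3, 1), (151, 2002, -2, 2), (151, 2004, -1, 1), (151, 2011, 0, 1),
    (152, 2001, -2, 1), (152, 2002, -1, 2), (152, 2012, 0, 1),
    (153, 2001, 1, 2), (153, 2002, 2, 2), (153, 2004, 3, 1), (153, 2011, 4, 1),
]


def group_by_composition(rows):
    """Return {composition: [(position, part, quantity), ...]} sorted by position."""
    groups = defaultdict(list)
    for comp, part, pos, qty in rows:
        groups[comp].append((pos, part, qty))
    return {comp: sorted(parts) for comp, parts in groups.items()}


def nearly_same(original, candidate):
    """Compare two position-sorted part lists under the assumptions stated above."""
    if not original or len(candidate) != len(original):
        return False
    base_o, base_c = original[0][0], candidate[0][0]
    # Express every position as an offset from the lowest position.
    rel_o = [(pos - base_o, part, qty) for pos, part, qty in original]
    rel_c = [(pos - base_c, part, qty) for pos, part, qty in candidate]
    # Every row below the top must agree on offset, part and quantity.
    if rel_o[:-1] != rel_c[:-1]:
        return False
    # The top row may hold a different part but must sit at the same offset.
    return rel_o[-1][0] == rel_c[-1][0]


def find_matches(rows, target):
    groups = group_by_composition(rows)
    original = groups[target]
    return sorted(comp for comp, parts in groups.items()
                  if comp != target and nearly_same(original, parts))


print(find_matches(ROWS, 101))
```

With the sample data, only composition 102 is reported for target 101; 151, 152 and 153 are rejected.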

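Of course, my problem is more interesting than that: our entity framework only allows simple SELECT queries, and we are supposed to support both SQL Server databases and Access "databases". The query should work on both system and temporary tables, and loops are a no-go.

I have worked with different databases for quite a while, but I don't remember ever having to match multiple values across multiple rows in a self-join. Still, it seems reasonable to expect that there is a simple way to achieve this. Is there?

Bonus question: I am considering simply querying the data I need into our .NET application and letting LINQ do its magic, but some people are worried that certain customers' computers may not cope well with that much in-memory data. We are talking about possibly up to a million rows, depending on the customer's database. Is that a valid concern?

Edit: as requested in the comments, here are some counterexamples of compositions that should not match when compared with composition 101: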

    FK_Composition | FK_Part | Position | Quantity | Reason
    ---------------+---------+----------+----------+------
    151            | 2001    | -3       | 1        | Part 2004 is no match
    151            | 2002    | -2       | 2        | 
    151            | 2004    | -1       | 1        | 
    151            | 2011    |  0       | 1        | 
    152            | 2001    | -2       | 1        | Has a different number of parts
    152            | 2002    | -1       | 2        | 
    152            | 2012    |  0       | 1        | 
    153            | 2001    |  1       | 2        | Position 1 has the wrong quantity
    153            | 2002    |  2       | 2        | 
    153            | 2004    |  3       | 1        | 
    153            | 2011    |  4       | 1        | 
    

2 answers:

Answer 0: (score: 0)

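Here is a SQL Server version. It will not work in MS Access, which supports neither the CTE nor the window functions used here, and honestly all the subqueries, derived tables and other things needed to do this in MS Access don't interest me. But it should give you an idea of the logic.

A working example is shown at http://rextester.com/FSJG1811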

DECLARE @Table AS TABLE (FKComposition INT, FKPart INT, Position INT, Quantity INT)
INSERT INTO @Table VALUES (101,2001,-3,1),(101,2002,-2,2),(101,2003,-1,1),(101,2011, 0,1)
,(102,2001,-2,1),(102,2002,-1,2),(102,2003, 0,1),(102,2012, 1,1) -- 3 of 4 match but not the highest position
,(103,2002,-2,1),(103,2003,-1,2),(103,2004, 0,1),(103,2012, 1,1) -- 3 of 4 match but not 1 of the middle positions
,(104,2001,-2,1) --match but last and highest position
,(105,2001,-3,1),(105,2002,-2,2),(105,2003,-1,1),(105,2011, 0,1) --exact match
,(106,2001,-5,1),(106,2002,-2,2),(106,2003,-1,1),(106,2011, 0,1) --exact match except PositionDifference

;WITH cte AS (
    SELECT
       *
       ,ROW_NUMBER() OVER (PARTITION BY FKComposition ORDER BY Position) as PosRowNum
       ,Position - COALESCE(LAG(Position) OVER (PARTITION BY FKComposition ORDER BY Position),Position) as PosDif
       ,COUNT(*) OVER (PARTITION BY FKcomposition) as PartsCount
       ,MAX(Position) OVER (PARTITION BY FKComposition) as HighestPosition
    FROM
       @Table
)

    SELECT
       CASE WHEN c1.FKComposition < c2.FKComposition THEN c1.FKComposition ELSE c2.FKComposition END as FKComposition
       ,CASE WHEN c1.FKComposition < c2.FKComposition THEN c2.FKComposition ELSE c1.FKComposition END as FKCompositionOfMatch
       ,c1.PartsCount
       ,COUNT(c2.FKComposition) / 2 as MatchedPartsCount
       ,COUNT(CASE WHEN c2.HighestPosition = c2.Position THEN c2.HighestPosition END) as MatchIncludesHighest
    FROM       
       cte c1
       INNER JOIN cte c2
       ON c1.FKComposition <> c2.FKComposition
       AND c1.FKPart = c2.FKPart
       AND c1.Quantity = c2.Quantity
       AND c1.PosRowNum = c2.PosRowNum
       AND c1.PosDif = c2.PosDif
    GROUP BY
       CASE WHEN c1.FKComposition < c2.FKComposition THEN c1.FKComposition ELSE c2.FKComposition END
       ,CASE WHEN c1.FKComposition < c2.FKComposition THEN c2.FKComposition ELSE c1.FKComposition END
       ,c1.PartsCount
       ,c2.PartsCount
    HAVING
       c1.PartsCount = c2.PartsCount
       AND (
          COUNT(c2.FKComposition) = (c1.PartsCount * 2) --there are 2 combinations 101 to 105 and 105 to 101 so must double the count
          OR (COUNT(c2.FKComposition) >= (c1.PartsCount * 2) - 2)
             AND COUNT(CASE WHEN c2.HighestPosition = c2.Position THEN c2.HighestPosition END) = 0
          )

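You will notice a few additional test cases. In the end, 101 should match 102 and 105, and 102 will match 105 (by proxy of 101). I simplified the output to one combination per pair rather than repeating it, so instead of getting both "101 matches 102" and "102 matches 101" you only get "101 matches 102".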
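
For the "simple SELECT only" constraint in the question, a similar idea can be written without a CTE or window functions, using only grouped derived tables and a self-join. The sketch below is an assumption and not part of the query above: it runs on SQLite through Python's `sqlite3` module purely so that it can be executed, the table name `Parts` is hypothetical, and it has not been verified against SQL Server or Access (Access in particular may need the joins parenthesised or rewritten). It assumes positions are unique within a composition and that both compositions have the same number of parts.

```python
import sqlite3

# Same test rows as in the answer above.
ROWS = [
    (101, 2001, -3, 1), (101, 2002, -2, 2), (101, 2003, -1, 1), (101, 2011, 0, 1),
    (102, 2001, -2, 1), (102, 2002, -1, 2), (102, 2003, 0, 1), (102, 2012, 1, 1),
    (103, 2002, -2, 1), (103, 2003, -1, 2), (103, 2004, 0, 1), (103, 2012, 1, 1),
    (104, 2001, -2, 1),
    (105, 2001, -3, 1), (105, 2002, -2, 2), (105, 2003, -1, 1), (105, 2011, 0, 1),
    (106, 2001, -5, 1), (106, 2002, -2, 2), (106, 2003, -1, 1), (106, 2011, 0, 1),
]

# so/sc: per-composition statistics for the original and the candidate.
# o/c:   the rows below the highest position of each; they must pair up on
#        part, quantity and offset from the lowest position.
QUERY = """
SELECT sc.FK_Composition
FROM (SELECT FK_Composition, MIN(Position) AS MinPos,
             MAX(Position) AS MaxPos, COUNT(*) AS Cnt
      FROM Parts GROUP BY FK_Composition) AS so
INNER JOIN (SELECT FK_Composition, MIN(Position) AS MinPos,
                   MAX(Position) AS MaxPos, COUNT(*) AS Cnt
            FROM Parts GROUP BY FK_Composition) AS sc
   ON sc.Cnt = so.Cnt
  AND sc.FK_Composition <> so.FK_Composition
  AND sc.MaxPos - sc.MinPos = so.MaxPos - so.MinPos
INNER JOIN Parts AS o
   ON o.FK_Composition = so.FK_Composition
  AND o.Position < so.MaxPos
INNER JOIN Parts AS c
   ON c.FK_Composition = sc.FK_Composition
  AND c.Position < sc.MaxPos
  AND c.FK_Part = o.FK_Part
  AND c.Quantity = o.Quantity
  AND c.Position - sc.MinPos = o.Position - so.MinPos
WHERE so.FK_Composition = ?
GROUP BY sc.FK_Composition, so.Cnt
HAVING COUNT(*) = so.Cnt - 1
ORDER BY sc.FK_Composition
"""


def find_matches(conn, target):
    """Return the compositions whose non-top rows all pair up with the target's."""
    return [row[0] for row in conn.execute(QUERY, (target,))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Parts (FK_Composition INT, FK_Part INT, Position INT, Quantity INT)"
)
conn.executemany("INSERT INTO Parts VALUES (?, ?, ?, ?)", ROWS)
print(find_matches(conn, 101))
```

On this data the query returns 102 and 105 for target 101. Compositions with a single part (such as 104) never match in this sketch, because they have no rows below the highest position to pair up.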

Answer 1: (score: 0)

I wrote the query below, which finds the relative positions of the parts; you can use it in the WHERE clause. Sorry, I can't spend more time helping.

SELECT STUFF((SELECT ',' + CAST(ISNULL(main.Position - sub.Position,0) AS VARCHAR(100)) AS ReletivePositions FROM
(SELECT *, ROW_NUMBER ( ) OVER (ORDER BY FK_Part DESC)  as RowNumber
FROM @PARTS P1
WHERE FK_Composition = 101 /* << CHANGE THIS TO FK_Composition IN QUERY */ ) main
LEFT JOIN (SELECT *, ROW_NUMBER ( ) OVER (ORDER BY FK_Part DESC)  as RowNumber
FROM @PARTS WHERE FK_Composition =  101  /* << CHANGE THIS TO FK_Composition IN QUERY */ ) sub ON main.RowNumber + 1 = sub.RowNumber
ORDER BY main.FK_Part DESC
FOR XML PATH(''), TYPE 
).value('.', 'NVARCHAR(MAX)'),1,1,'') AS ReletivePositions
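
The idea of reducing a composition to a string of relative positions can also be sketched outside SQL. The following is a minimal Python sketch with assumptions of its own: unlike the query above it orders the rows by Position rather than by FK_Part, it emits the gaps between consecutive positions, and `relative_positions` is a hypothetical name.

```python
def relative_positions(rows, composition):
    """Return the gaps between consecutive positions as a comma-separated string."""
    positions = sorted(pos for comp, _part, pos, _qty in rows if comp == composition)
    gaps = [upper - lower for lower, upper in zip(positions, positions[1:])]
    return ",".join(str(gap) for gap in gaps)


# (FK_Composition, FK_Part, Position, Quantity)
ROWS = [
    (101, 2001, -3, 1), (101, 2002, -2, 2), (101, 2003, -1, 1), (101, 2011, 0, 1),
    (102, 2001, -2, 1), (102, 2002, -1, 2), (102, 2003, 0, 1), (102, 2012, 1, 1),
    (106, 2001, -5, 1), (106, 2002, -2, 2), (106, 2003, -1, 1), (106, 2011, 0, 1),
]

for comp in (101, 102, 106):
    print(comp, relative_positions(ROWS, comp))
```

Two compositions that produce the same string have the same relative layout, so the string can serve as a cheap first filter before parts and quantities are compared.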