Add records to a table where commonality exists in the data

Time: 2017-11-03 18:13:42

Tags: sql sql-server sql-server-2008

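I have a table containing a list of purchased items and their replacement items. Where the replacement items of different purchased items overlap for the same buyer, I want to identify the possible matching pairs that do not currently exist in the table.

For example, the table might contain the following data: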

+----------+---------+---------+
| Buyer_ID | Buy_ID  | Rplc_ID |
+----------+---------+---------+
| 12345    | A       | C       |
| 12345    | A       | D       |
| 12345    | B       | C       |
+----------+---------+---------+

However, because Buy_IDs A and B have C in common, I want to identify that B is missing D:

+----------+---------+---------+
| Buyer_ID | Buy_ID  | Rplc_ID |
+----------+---------+---------+
| 12345    | A       | C       |
| 12345    | A       | D       |
| 12345    | B       | C       |
| 12345    | B       | D       |
+----------+---------+---------+

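This is almost like a Venn diagram, in that I need to identify where Rplc_IDs overlap across different Buy_IDs. I then need to eliminate the mismatches between them, so that any overlap in Rplc_IDs forces every Buy_ID sharing that overlap to carry the same full set of Rplc_IDs.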

[Image: Venn diagram 1]

[Image: Venn diagram 2]

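This is a fairly mild example. What I am struggling with is the varying degrees to which this can occur. For example, if I have 10 Buy_IDs, each with 10 Rplc_IDs, and pairs of Buy_IDs have Rplc_IDs in common, how can I query this programmatically?

I have tried various techniques using dynamic SQL and loops to identify and append records to the table, successively adding to the A/B overlap, but with limited success. I imagine a CTE could do this, but I have not been able to formulate a query that does it efficiently.

I have provided a more detailed example here. Note that the first table, @Replacements, contains the initial data. The @Results table shows what I would like returned.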

DECLARE @Replacements TABLE
(
 Buyer_ID       VARCHAR(10),
 BUY_ID         VARCHAR(10),
 RPL_ID         VARCHAR(10)

 )

DECLARE @Results TABLE
(
 Buyer_ID       VARCHAR(10),
 BUY_ID         VARCHAR(10),
 RPL_ID         VARCHAR(10)

 )

Insert into @Replacements 
VALUES
('10003','A','D'),
('10003','A','E'),
('10003','A','F'),
('10003','A','B'),
('10003','A','C'),
('10003','B','B'),
('10003','C','D'),
('10003','C','E'),
('10003','C','F'),
('10003','C','B')


Insert into @Results 
VALUES
('10003','A','D'),
('10003','A','E'),
('10003','A','F'),
('10003','A','B'),
('10003','A','C'),
('10003','B','B'),
('10003','C','D'),
('10003','C','E'),
('10003','C','F'),
('10003','C','B'),
('10003','B','D'),
('10003','B','E'),
('10003','B','F'),
('10003','B','C'),
('10003','C','C')

Select * from @Replacements ORDER BY BUY_ID, RPL_ID
Select * from @Results ORDER BY BUY_ID, RPL_ID

The output is as follows. @Replacements:

Buyer_ID    BUY_ID  RPL_ID
10003       A       B
10003       A       C
10003       A       D
10003       A       E
10003       A       F
10003       B       B
10003       C       B
10003       C       D
10003       C       E
10003       C       F

Expected results, @Results:

Buyer_ID    BUY_ID  RPL_ID
10003       A       B
10003       A       C
10003       A       D
10003       A       E
10003       A       F
10003       B       B
10003       B       C
10003       B       D
10003       B       E
10003       B       F
10003       C       B
10003       C       C
10003       C       D
10003       C       E
10003       C       F

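Working through this step by step: Buy_IDs A and C have Rplc_IDs B, D, E and F in common. Because A also has C, Buy_ID C must have C as well.

Because Buy_IDs A, B and C all have Rplc_ID B in common, Buy_ID B must receive the Rplc_IDs that Buy_IDs A and C have, so Buy_ID B is given Rplc_IDs C, D, E and F.

Any suggestions?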

3 answers:

Answer 0: (score: 1)

Does this do what you need?

SELECT DISTINCT
    R.Buyer_ID
    , R.BUY_ID
    , R2.RPL_ID
FROM
    @Replacements R
    JOIN @Replacements R1 ON
        R.Buyer_ID = R1.Buyer_ID
        AND R.RPL_ID = R1.RPL_ID
    JOIN @Replacements R2 ON
        R.Buyer_ID = R2.Buyer_ID
        AND R1.BUY_ID = R2.BUY_ID
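
If only the pairs that are not yet in the table are wanted, one option is to subtract the existing rows with EXCEPT. The following is a minimal sketch, assuming the @Replacements table variable from the question is declared and loaded in the same batch; it performs the same single matching pass as the query above and nothing more.

```sql
-- Sketch: one matching pass, minus the rows already present.
-- Assumes @Replacements is declared and populated earlier in the same batch.
SELECT
      R.Buyer_ID
    , R.BUY_ID
    , R2.RPL_ID
FROM
    @Replacements R
    JOIN @Replacements R1 ON        -- BUY_IDs sharing an RPL_ID with R
        R.Buyer_ID = R1.Buyer_ID
        AND R.RPL_ID = R1.RPL_ID
    JOIN @Replacements R2 ON        -- every RPL_ID of those BUY_IDs
        R.Buyer_ID = R2.Buyer_ID
        AND R1.BUY_ID = R2.BUY_ID
EXCEPT                              -- EXCEPT also removes duplicates, so DISTINCT is not needed
SELECT
      Buyer_ID
    , BUY_ID
    , RPL_ID
FROM
    @Replacements;
```

For the sample data in the question, this should return the five rows that appear in @Results but not in @Replacements.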

Answer 1: (score: 1)

Try this:

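select distinct
       r.Buyer_ID
     , r.BUY_ID
     , r2.RPL_ID
  from @Replacements r
  join @Replacements r1              -- rows sharing an RPL_ID with r
    on r1.RPL_ID = r.RPL_ID
   and r1.Buyer_ID = r.Buyer_ID      -- keep matches within the same buyer
  join @Replacements r2              -- every RPL_ID of those BUY_IDs
    on r2.BUY_ID = r1.BUY_ID
   and r2.Buyer_ID = r1.Buyer_ID     -- BUY_IDs repeat across buyers
 order by 1, 2, 3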

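Answer 2: (score: 0)

Thanks for the help above, which got me 99% of the way there. However, I found that in some cases this matching process has to be repeated.

Here is the initial data load: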

DECLARE @Replacements TABLE
(
 Buyer_ID       VARCHAR(10),
 BUY_ID         VARCHAR(10),
 RPL_ID         VARCHAR(10)

 )

DECLARE @Results TABLE
(
 Buyer_ID       VARCHAR(10),
 BUY_ID         VARCHAR(10),
 RPL_ID         VARCHAR(10)

 )

Insert into @Replacements 
VALUES
('10003','A','D'),
('10003','A','E'),
('10003','A','F'),
('10003','A','B'),
('10003','A','C'),
('10003','B','B'),
('10003','C','D'),
('10003','C','E'),
('10003','C','F'),
('10003','C','B'),

('10004','A','S'),
('10004','A','T'),
('10004','A','U'),
('10004','A','Z'),
('10004','B','S'),
('10004','B','T'),
('10004','B','U'),
('10004','B','V'),
('10004','B','W'),
('10004','C','V'),
('10004','C','W'),
('10004','C','X'),
('10004','D','X'),
('10004','D','Y'),
('10004','D','Z'),
('10004','E','P'),
('10004','E','Q'),
('10004','E','R'),
('10004','F','P'),
('10004','F','Q'),
('10004','F','R')




Insert into @Results 
VALUES
('10003','A','D'),
('10003','A','E'),
('10003','A','F'),
('10003','A','B'),
('10003','A','C'),
('10003','B','B'),
('10003','C','D'),
('10003','C','E'),
('10003','C','F'),
('10003','C','B'),
('10003','B','D'),
('10003','B','E'),
('10003','B','F'),
('10003','B','C'),
('10003','C','C'),

('10004','A','S'),
('10004','A','T'),
('10004','A','U'),
('10004','A','V'),
('10004','A','W'),
('10004','A','X'),
('10004','A','Y'),
('10004','A','Z'),
('10004','B','S'),
('10004','B','T'),
('10004','B','U'),
('10004','B','V'),
('10004','B','W'),
('10004','B','X'),
('10004','B','Y'),
('10004','B','Z'),
('10004','C','T'),
('10004','C','U'),
('10004','C','V'),
('10004','C','W'),
('10004','C','X'),
('10004','C','Y'),
('10004','C','Z'),
('10004','D','S'),
('10004','D','T'),
('10004','D','U'),
('10004','D','V'),
('10004','D','W'),
('10004','D','X'),
('10004','D','Y'),
('10004','D','Z'),
('10004','E','P'),
('10004','E','Q'),
('10004','E','R'),
('10004','F','P'),
('10004','F','Q'),
('10004','F','R')

If I run the following SQL script:

SELECT DISTINCT
    R.Buyer_ID
    , R.BUY_ID
    , R2.RPL_ID
FROM
    @Replacements R
    JOIN @Replacements R1 ON
        R.Buyer_ID = R1.Buyer_ID
        AND R.RPL_ID = R1.RPL_ID
    JOIN @Replacements R2 ON
        R.Buyer_ID = R2.Buyer_ID
        AND R1.BUY_ID = R2.BUY_ID

and compare its output with the results I expect:

 Select RSLT.* 
 FROM @Results RSLT
 LEFT JOIN 
    (SELECT DISTINCT
        R.Buyer_ID
        , R.BUY_ID
        , R2.RPL_ID
    FROM
        @Replacements R
        JOIN @Replacements R1 ON
            R.Buyer_ID = R1.Buyer_ID
            AND R.RPL_ID = R1.RPL_ID
        JOIN @Replacements R2 ON
            R.Buyer_ID = R2.Buyer_ID
            AND R1.BUY_ID = R2.BUY_ID)RPLC
  ON RSLT.Buyer_ID=RPLC.Buyer_ID
    and RSLT.BUY_ID=RPLC.BUY_ID
    and RSLT.RPL_ID=RPLC.RPL_ID
where RPLC.Buyer_ID IS NULL

I am left with one missing record:

Buyer_ID    BUY_ID  RPL_ID
10004        B       Y
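
One way to see why this row is missed: a single pass only reaches BUY_IDs that share an RPL_ID directly with B. In the data above, D is the only BUY_ID that holds Y, and B has no RPL_ID in common with D; B reaches D only through A (which shares Z with D) or through C (which shares X with D). The sketch below lists those two-step chains. It assumes @Replacements is declared and loaded in the same batch, and the column aliases are illustrative names only.

```sql
-- Sketch: trace how BUY_ID B can reach RPL_ID Y for buyer 10004.
-- Chain: B -> shared RPL_ID -> middle BUY_ID -> shared RPL_ID -> BUY_ID that has Y.
SELECT DISTINCT
      S.BUY_ID   AS Start_Buy
    , S.RPL_ID   AS First_Shared_Rpl
    , M1.BUY_ID  AS Middle_Buy
    , M2.RPL_ID  AS Second_Shared_Rpl
    , E1.BUY_ID  AS End_Buy
FROM @Replacements S
JOIN @Replacements M1              -- a different BUY_ID sharing an RPL_ID with B
    ON  M1.Buyer_ID = S.Buyer_ID
    AND M1.RPL_ID   = S.RPL_ID
    AND M1.BUY_ID  <> S.BUY_ID
JOIN @Replacements M2              -- every RPL_ID of that middle BUY_ID
    ON  M2.Buyer_ID = M1.Buyer_ID
    AND M2.BUY_ID   = M1.BUY_ID
JOIN @Replacements E1              -- a further BUY_ID sharing one of those RPL_IDs
    ON  E1.Buyer_ID = M2.Buyer_ID
    AND E1.RPL_ID   = M2.RPL_ID
    AND E1.BUY_ID  <> M1.BUY_ID
JOIN @Replacements E2              -- keep only end BUY_IDs that hold Y
    ON  E2.Buyer_ID = E1.Buyer_ID
    AND E2.BUY_ID   = E1.BUY_ID
    AND E2.RPL_ID   = 'Y'
WHERE S.Buyer_ID = '10004'
  AND S.BUY_ID   = 'B';
```

Because Y sits two links away from B, one pass of the matching query cannot assign it; a second pass over the first pass's output can.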

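I found that if I turn the expression into a CTE and self-join it, I end up with the expected result. Frankly, I am not sure this is anything more than effectively running the query twice by wrapping it in a CTE.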

;with cRSLT as (
SELECT DISTINCT
    R.Buyer_ID
    , R.BUY_ID
    , R2.RPL_ID
FROM
    @Replacements R
    JOIN @Replacements R1 ON
        R.Buyer_ID = R1.Buyer_ID
        AND R.RPL_ID = R1.RPL_ID
    JOIN @Replacements R2 ON
        R.Buyer_ID = R2.Buyer_ID
        AND R1.BUY_ID = R2.BUY_ID )

Select DISTINCT
    r.Buyer_ID,
    r.BUY_ID,
    r2.RPL_ID
from cRSLT r
  JOIN cRSLT r1 
    ON r.Buyer_ID=r1.Buyer_ID
     and r.RPL_ID=r1.RPL_ID
  JOIN cRSLT r2
    ON r.Buyer_ID=r2.Buyer_ID
     and r1.BUY_ID=r2.BUY_ID
ORDER BY r.Buyer_ID, r.BUY_ID, r2.RPL_ID
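
Self-joining the CTE does amount to applying the same matching pass to its own output, so it covers chains of two links, which is enough for the data above. A longer chain of overlapping BUY_IDs would presumably need further passes. One possible generalisation is to repeat the pass until it stops producing new rows. The following is a sketch under stated assumptions, not a tuned solution: it assumes @Replacements is declared and loaded in the same batch, it writes the new rows back into @Replacements (copy the data to a working table first if the original rows must stay untouched), and @RowsAdded is a hypothetical helper variable.

```sql
-- Sketch: repeat the matching pass until it adds no new rows.
DECLARE @RowsAdded INT;
SET @RowsAdded = 1;

WHILE @RowsAdded > 0
BEGIN
    INSERT INTO @Replacements (Buyer_ID, BUY_ID, RPL_ID)
    SELECT DISTINCT
          R.Buyer_ID
        , R.BUY_ID
        , R2.RPL_ID
    FROM
        @Replacements R
        JOIN @Replacements R1 ON
            R.Buyer_ID = R1.Buyer_ID
            AND R.RPL_ID = R1.RPL_ID
        JOIN @Replacements R2 ON
            R.Buyer_ID = R2.Buyer_ID
            AND R1.BUY_ID = R2.BUY_ID
    WHERE NOT EXISTS (              -- skip pairs that are already in the table
        SELECT 1
        FROM @Replacements X
        WHERE X.Buyer_ID = R.Buyer_ID
          AND X.BUY_ID   = R.BUY_ID
          AND X.RPL_ID   = R2.RPL_ID
    );

    -- Must be read immediately after the INSERT; 0 means nothing new was found.
    SET @RowsAdded = @@ROWCOUNT;
END;

SELECT Buyer_ID, BUY_ID, RPL_ID
FROM @Replacements
ORDER BY Buyer_ID, BUY_ID, RPL_ID;
```

The loop terminates because each iteration either inserts at least one new (Buyer_ID, BUY_ID, RPL_ID) combination, of which there are finitely many, or inserts nothing and exits.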