How to eliminate duplicate rows where a duplicate is defined by transposed values between two columns

Time: 2016-06-09 18:54:31

Tags: tsql

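I want to eliminate any row whose column A value appears in column B of another row while its column B value appears in column A of that same row.

For example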

id | column_A | column_B
------------------------
1  | quick    | brown
2  | quick    | fox
3  | brown    | quick
4  | lazy     | dog  
5  | fox      | quick

I am trying to get this result set

id | column_A | column_B
------------------------
1  | quick    | brown
2  | quick    | fox
4  | lazy     | dog

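As you can see, the rows with ids 3 and 5 are eliminated. The row with id = 3 is eliminated because its values, column_A = brown and column_B = quick, match the transposed values of the row with id = 1, where column_A = quick and column_B = brown. Similarly, the row with id = 5 is eliminated because it is the transpose of the row with id = 2.

1 Answer:

Answer 0: (score: 0)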

DECLARE @Tx TABLE (
     ID         INT IDENTITY
    ,column_A   NVARCHAR(20)
    ,column_B   NVARCHAR(20)
    )

INSERT INTO @Tx (column_A, column_B) VALUES
     ('quick','brown')
    ,('quick','fox')
    ,('brown','quick')
    ,('lazy','dog')
    ,('fox','quick')

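;WITH   RN
    AS (
        -- Normalize each pair so the smaller value always comes first:
        -- (quick, brown) and (brown, quick) both become Lo = brown, Hi = quick.
        -- Two separate columns avoid the collisions that plain concatenation
        -- allows, e.g. 'ab' + 'c' and 'a' + 'bc' both yield 'abc'.
        SELECT ID,
            CASE WHEN column_A < column_B THEN column_A ELSE column_B END AS Lo,
            CASE WHEN column_A < column_B THEN column_B ELSE column_A END AS Hi
            FROM @Tx
        ),

        RO
    AS (
        -- Number the rows within each normalized pair; the lowest ID gets RON = 1.
        SELECT ID, ROW_NUMBER() OVER (PARTITION BY Lo, Hi ORDER BY ID) AS RON
            FROM RN
        )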

-- Delete every row that is not the first occurrence of its pair.
DELETE Tx
    FROM @Tx Tx
        INNER JOIN RO
            ON Tx.ID = RO.ID
                WHERE RO.RON > 1

SELECT * FROM @Tx
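
If the goal is only to return the de-duplicated result set rather than delete rows, one possible alternative is a NOT EXISTS check against earlier rows. This is a minimal sketch, not part of the original answer; the table variable @Pairs and the alias earlier are hypothetical names chosen for illustration, and the data is copied from the question.

```sql
-- Hypothetical sample table mirroring the question's data.
DECLARE @Pairs TABLE (
     ID         INT IDENTITY
    ,column_A   NVARCHAR(20)
    ,column_B   NVARCHAR(20)
    );

INSERT INTO @Pairs (column_A, column_B) VALUES
     ('quick','brown')
    ,('quick','fox')
    ,('brown','quick')
    ,('lazy','dog')
    ,('fox','quick');

-- Keep a row unless an earlier row (lower ID) holds the same two values swapped.
SELECT p.ID, p.column_A, p.column_B
    FROM @Pairs p
    WHERE NOT EXISTS (
        SELECT 1
            FROM @Pairs earlier
            WHERE earlier.ID < p.ID
              AND earlier.column_A = p.column_B
              AND earlier.column_B = p.column_A
        )
    ORDER BY p.ID;
```

With the sample data this returns the rows with ids 1, 2 and 4. Unlike the DELETE above, this variant only treats swapped pairs as duplicates, so two rows with identical values in the same order would both be kept.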