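I am trying to identify a list of duplicates in a table. My table looks like this:
Column1 - Column2
5 - 4
Column2 is really a varchar column, but I have used all numbers here to keep things simple.
I have been playing with CheckSum_Agg, but it gives false positives. :(
My output would look like this:
The first column holds the minimum ID, and the second column holds all of the other values. Duplicates are omitted.
Another example might look like this:
I am using SQL Server 2012. Thanks!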
Answer 0 (score: 0)
WITH t AS (
    SELECT
        column1,
        COUNT(*) c
    FROM MyTable
    GROUP BY column1
)
SELECT
    t1.column1,
    t2.column1
FROM t t1
INNER JOIN t t2 ON (
    t1.c = t2.c AND
    t2.column1 > t1.column1
)
WHERE NOT EXISTS (
    SELECT column2 FROM MyTable WHERE column1 = t1.column1
    EXCEPT
    SELECT column2 FROM MyTable WHERE column1 = t2.column1
)
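To see what this query returns, here is a minimal sketch run against made-up data. The temp table name #MyTableDemo, the output aliases, and the rows are assumptions for illustration, not the data from the question, and the sketch assumes each (column1, column2) pair occurs only once.

```sql
-- Hypothetical sample data: the table name and rows are illustrative assumptions.
CREATE TABLE #MyTableDemo (column1 INT NOT NULL, column2 VARCHAR(10) NOT NULL);

INSERT INTO #MyTableDemo (column1, column2) VALUES
    (1, 'a'), (1, 'b'),
    (2, 'a'), (2, 'b'),
    (3, 'a'), (3, 'c'),
    (4, 'a'), (4, 'b');

-- Same shape as the query above, pointed at the demo table.
WITH t AS (
    -- Number of rows in each column1 group
    SELECT column1, COUNT(*) AS c
    FROM #MyTableDemo
    GROUP BY column1
)
SELECT t1.column1 AS first_id, t2.column1 AS second_id
FROM t AS t1
INNER JOIN t AS t2
    ON t1.c = t2.c
   AND t2.column1 > t1.column1
WHERE NOT EXISTS (
    -- Any column2 value in the first group that the second group lacks
    SELECT column2 FROM #MyTableDemo WHERE column1 = t1.column1
    EXCEPT
    SELECT column2 FROM #MyTableDemo WHERE column1 = t2.column1
)
ORDER BY first_id, second_id;

DROP TABLE #MyTableDemo;
```

With these rows, groups 1, 2 and 4 share the same set of column2 values, so the query pairs them as (1, 2), (1, 4) and (2, 4), and group 3 matches nothing. Note that group 2 still shows up on the left even though it is itself a duplicate of group 1. The EXCEPT check only runs in one direction, so the equal-count condition is what rules out a mere subset; if a (column1, column2) pair can repeat within a group, COUNT(*) may stop being a reliable size check.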
Answer 1 (score: 0)
select column1,column2 from my_table
group by column1,column2
having COUNT(*) > 1
This will give you the list of duplicate records.
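A small sketch of what this query reports, using made-up rows (the temp table #DupDemo, its contents, and the occurrences column are assumptions for illustration). It flags (column1, column2) pairs that occur more than once, which is a row-level check rather than a comparison of whole groups.

```sql
-- Hypothetical sample data: the pair (1, 'a') is stored twice.
CREATE TABLE #DupDemo (column1 INT NOT NULL, column2 VARCHAR(10) NOT NULL);

INSERT INTO #DupDemo (column1, column2) VALUES
    (1, 'a'), (1, 'a'), (1, 'b'),
    (2, 'a'), (2, 'b');

-- Same grouping as above, with the count added to the output.
SELECT column1, column2, COUNT(*) AS occurrences
FROM #DupDemo
GROUP BY column1, column2
HAVING COUNT(*) > 1;

DROP TABLE #DupDemo;
```

With these rows only (1, 'a') comes back, with a count of 2. If every (column1, column2) pair in the table is unique, the query returns nothing, even when two column1 groups contain exactly the same column2 values.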
Answer 2 (score: 0)
--This code produced the results I was looking for in the original post.
WITH t AS (
    SELECT
        column1,
        COUNT(*) c
    FROM #tbl
    GROUP BY column1
),
tt AS (
    SELECT
        t1.column1 as 'winner',
        t2.column1 as 'loser'
    FROM t t1
    INNER JOIN t t2 ON (
        t1.c = t2.c AND
        t1.column1 < t2.column1
    )
    WHERE NOT EXISTS (
        SELECT column2 FROM #tbl WHERE column1 = t1.column1
        EXCEPT
        SELECT column2 FROM #tbl WHERE column1 = t2.column1
    )
)
SELECT fullList.winner, fullList.loser
FROM
    ( SELECT winner FROM tt tt1
      EXCEPT
      SELECT loser FROM tt tt2
    ) winnerList
JOIN tt fullList on winnerList.winner = fullList.winner
ORDER BY fullList.winner, fullList.loser
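To try the query above, #tbl has to exist first. The following is only an assumed setup with invented rows, since the original data is not shown here; the column types are a guess based on the question's description. Run it before the query, in the same session.

```sql
-- Assumed schema and invented rows, for illustration only.
IF OBJECT_ID('tempdb..#tbl') IS NOT NULL
    DROP TABLE #tbl;

CREATE TABLE #tbl (column1 INT NOT NULL, column2 VARCHAR(10) NOT NULL);

INSERT INTO #tbl (column1, column2) VALUES
    (1, 'a'), (1, 'b'),   -- groups 1, 2 and 4 hold the same set of values
    (2, 'a'), (2, 'b'),
    (3, 'a'), (3, 'c'),   -- group 3 matches no other group
    (4, 'a'), (4, 'b'),
    (5, 'x'),             -- groups 5 and 6 form a second family of duplicates
    (6, 'x');
```

With these rows the query should return (1, 2), (1, 4) and (5, 6). The intermediate pair (2, 4) is dropped because 2 also appears as a loser, so only the smallest ID of each family is kept as the winner.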