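I have a mailing list in which many entries are duplicated. I am trying to find the duplicates so that I can delete them. When I run the full query below, I get every row in the table (142,000+). When I run the subquery on its own, I get only 5,768 rows. I need all columns of each duplicated row to decide which one to delete. What am I doing wrong that makes the full query return every row?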
select * from Mailinglist
where exists
(select count(*), mailaddress, CenterName
from Mailinglist
group by MailAddress, CenterName
having count(*)>1)
Answer 0: (score: 3)
You have to do it like this:
select t1.*, t2.cnt
from Mailinglist t1
join (
select count(*) as cnt, mailaddress, CenterName
from Mailinglist
group by MailAddress, CenterName
having count(*)>1
) t2 ON t1.MailAddress = t2.MailAddress and t1.CenterName = t2.CenterName
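EXISTS only checks whether any record exists: if the subquery returns one or more rows, EXISTS evaluates to true. Because your subquery does not reference the outer row, it returns the same result for every row of Mailinglist, so every row passes the filter.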
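To make the difference concrete, here is a minimal sketch that runs both forms against a tiny in-memory table. It assumes SQLite (through Python's `sqlite3` module) as a stand-in for the real database, and the four sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: one duplicated (MailAddress, CenterName) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mailinglist (MailAddress TEXT, CenterName TEXT)")
conn.executemany(
    "INSERT INTO Mailinglist VALUES (?, ?)",
    [
        ("a@example.com", "North"),
        ("a@example.com", "North"),  # duplicate of the row above
        ("b@example.com", "South"),
        ("c@example.com", "East"),
    ],
)

# The grouping subquery from the question: one row per duplicated pair.
dup_groups = """
    SELECT COUNT(*) AS cnt, MailAddress, CenterName
    FROM Mailinglist
    GROUP BY MailAddress, CenterName
    HAVING COUNT(*) > 1
"""

# Uncorrelated EXISTS: the subquery is non-empty, so every outer row passes.
exists_rows = conn.execute(
    f"SELECT * FROM Mailinglist WHERE EXISTS ({dup_groups})"
).fetchall()

# Join on both columns: only rows belonging to a duplicated pair survive.
joined_rows = conn.execute(
    f"""
    SELECT t1.*, t2.cnt
    FROM Mailinglist t1
    JOIN ({dup_groups}) t2
      ON t1.MailAddress = t2.MailAddress AND t1.CenterName = t2.CenterName
    """
).fetchall()

print(len(exists_rows), len(joined_rows))
```

With this sample data the EXISTS version returns the whole table, while the join returns only the two duplicated rows together with their count.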
Answer 1: (score: 2)
EXISTS returns true as soon as the subquery returns any row.
What you are looking for is:
select * from Mailinglist
where mailaddress IN
(
select mailaddress
from Mailinglist
group by MailAddress, CenterName
having count(*)>1
)
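One thing worth checking before relying on the IN form: it matches on `mailaddress` alone, while the subquery groups by both columns. The sketch below, again assuming SQLite via Python's `sqlite3` and made-up sample rows, shows a row being returned even though its (MailAddress, CenterName) pair is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mailinglist (MailAddress TEXT, CenterName TEXT)")
conn.executemany(
    "INSERT INTO Mailinglist VALUES (?, ?)",
    [
        ("a@example.com", "North"),
        ("a@example.com", "North"),  # true duplicate pair
        ("a@example.com", "South"),  # same address, different center
        ("b@example.com", "East"),
    ],
)

# IN compares only MailAddress, so any row sharing an address with a
# duplicated pair is returned, whatever its CenterName is.
in_rows = conn.execute(
    """
    SELECT * FROM Mailinglist
    WHERE MailAddress IN (
        SELECT MailAddress
        FROM Mailinglist
        GROUP BY MailAddress, CenterName
        HAVING COUNT(*) > 1
    )
    """
).fetchall()

print(in_rows)
```

If CenterName matters for deciding what counts as a duplicate, matching on both columns (as the join in Answer 0 does) avoids picking up the extra row.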
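Answer 2: (score: 0)
This is because EXISTS returns true if the subquery returns at least one row. Your subquery returns one or more rows, so the EXISTS condition evaluates to TRUE for every row.
To get the MailingList entries that have duplicates, you can simply run the subquery: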
SELECT
COUNT(*),
mailaddress,
CenterName
FROM Mailinglist
GROUP BY
MailAddress, CenterName
HAVING COUNT(*) > 1
To delete the duplicates, you can use ROW_NUMBER:
WITH Cte AS(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY MailAddress, Centername ORDER BY (SELECT NULL))
FROM MailingList
)
DELETE FROM Cte WHERE rn > 1
Just replace the ORDER BY clause, depending on which row of each duplicate group you want to keep.
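Deleting through a CTE as above looks like SQL Server syntax. For engines that do not allow it, the same ROW_NUMBER idea can be sketched by numbering rows in a subquery and deleting by row identifier. The example assumes SQLite 3.25 or newer (needed for window functions), uses SQLite's implicit `rowid` as the identifier, and runs on made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mailinglist (MailAddress TEXT, CenterName TEXT)")
conn.executemany(
    "INSERT INTO Mailinglist VALUES (?, ?)",
    [
        ("a@example.com", "North"),
        ("a@example.com", "North"),
        ("a@example.com", "North"),
        ("b@example.com", "South"),
    ],
)

# Number the rows inside each (MailAddress, CenterName) group, then delete
# everything except row number 1. ORDER BY rowid keeps the earliest-inserted
# row; change it to control which duplicate survives.
conn.execute(
    """
    DELETE FROM Mailinglist
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY MailAddress, CenterName
                       ORDER BY rowid
                   ) AS rn
            FROM Mailinglist
        )
        WHERE rn > 1
    )
    """
)

remaining = conn.execute(
    "SELECT MailAddress, CenterName FROM Mailinglist ORDER BY MailAddress"
).fetchall()
print(remaining)
```

Running the inner SELECT on its own first is a cheap way to review which rows would be removed before committing to the DELETE.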
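Answer 3: (score: 0)
The subquery has no filter (WHERE clause) tying it to the outer row, so it always produces some result. Correlate it with the outer query: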
SELECT *
FROM Mailinglist AS ML
WHERE EXISTS
(SELECT COUNT(*) AS Expr1, mailaddress, CenterName
FROM Mailinglist AS CNT
WHERE (ML.MailAddress = CNT.MailAddress) AND (ML.CenterName = CNT.CenterName)
GROUP BY mailaddress, CenterName
HAVING (COUNT(*) > 1))
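As a quick check of the correlated form, the sketch below runs it on a small in-memory table. SQLite (via Python's `sqlite3`) and the sample rows are assumptions for illustration; the query keeps the `ML` and `CNT` aliases used above but selects a constant inside EXISTS, since the selected columns do not affect the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mailinglist (MailAddress TEXT, CenterName TEXT)")
conn.executemany(
    "INSERT INTO Mailinglist VALUES (?, ?)",
    [
        ("a@example.com", "North"),
        ("a@example.com", "North"),  # duplicate of the row above
        ("b@example.com", "South"),
        ("c@example.com", "East"),
    ],
)

# The inner WHERE restricts the subquery to rows matching the current outer
# row, so HAVING COUNT(*) > 1 only succeeds for rows that have a twin.
correlated_rows = conn.execute(
    """
    SELECT ML.MailAddress, ML.CenterName
    FROM Mailinglist AS ML
    WHERE EXISTS (
        SELECT 1
        FROM Mailinglist AS CNT
        WHERE ML.MailAddress = CNT.MailAddress
          AND ML.CenterName = CNT.CenterName
        GROUP BY CNT.MailAddress, CNT.CenterName
        HAVING COUNT(*) > 1
    )
    """
).fetchall()

print(correlated_rows)
```

Only the two `a@example.com` / `North` rows come back here, rather than the whole table.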