MySQL - selectively deleting duplicate records

Date: 2016-08-04 05:21:16

Tags: php mysql duplicates

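I have a database of a few thousand contacts and want to delete all duplicate records. My current SQL query works well (a record counts as a duplicate when tel, email and name1 all match). The query deletes the duplicates with the lower IDs and keeps the last occurrence. In some cases, however, other fields of a record have already been filled in (title and name2 are the important ones). What I want is for MySQL to check whether those fields are filled in and keep only the record that holds the most information.

My query: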

<?php

// Keep only the row with the highest id in each (name1, tel, email) group
// and delete every other row of that group.
$del_duplicate_contacts = $mysqli->query("
    DELETE ca
    FROM contacts ca
    LEFT JOIN (
        SELECT MAX(id) AS id, name1, tel, email
        FROM contacts
        GROUP BY name1, tel, email
    ) cb ON ca.id = cb.id
        AND ca.name1 = cb.name1
        AND ca.tel = cb.tel
        AND ca.email = cb.email
    WHERE cb.id IS NULL
");

?>
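
To see which groups the query above acts on before running it, a read-only sketch like the following could be used. It assumes nothing beyond the `contacts` table and the three columns that define a duplicate; the aliases `copies`, `kept_id` and `all_ids` are made up for illustration.

```sql
-- List every (name1, tel, email) group that has more than one row.
-- kept_id is the row the MAX(id) approach would keep; all_ids shows the whole group.
SELECT name1, tel, email,
       COUNT(*)                     AS copies,
       MAX(id)                      AS kept_id,
       GROUP_CONCAT(id ORDER BY id) AS all_ids
FROM contacts
GROUP BY name1, tel, email
HAVING COUNT(*) > 1;
```

Running the same SELECT again after a delete is a cheap way to confirm that no duplicate groups are left.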

Example table:

ID   title   name1   name2    tel      email
1            John             01234    1@1.com
2    Mr      John     Smith   01234    1@1.com
3            John             01234    1@1.com

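My query would delete records 1 and 2. I want to keep only no. 2 and delete 1 and 3. How can I do this? Is it possible? Or should I perhaps involve PHP, and if so, how?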
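
For anyone who wants to reproduce the example table, here is a minimal setup sketch. Only the column names come from the table above; the column types, and the choice of empty strings rather than NULL for the blank cells, are assumptions. It is meant for a scratch database, since it creates a table called `contacts`.

```sql
-- Hypothetical schema: the types are guesses, only the column names are from the question.
CREATE TABLE contacts (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(20)  NULL,
    name1 VARCHAR(100) NULL,
    name2 VARCHAR(100) NULL,
    tel   VARCHAR(30)  NULL,
    email VARCHAR(255) NULL
);

-- The three sample rows; blank cells are stored as empty strings here (an assumption).
INSERT INTO contacts (id, title, name1, name2, tel, email) VALUES
    (1, '',   'John', '',      '01234', '1@1.com'),
    (2, 'Mr', 'John', 'Smith', '01234', '1@1.com'),
    (3, '',   'John', '',      '01234', '1@1.com');

-- Count how many of the optional fields (title, name2) each row has filled in.
-- In MySQL a comparison evaluates to 1 or 0, so the two results can be added.
SELECT id, name1, tel, email,
       (COALESCE(title, '') <> '') + (COALESCE(name2, '') <> '') AS filled_fields
FROM contacts
ORDER BY name1, tel, email, filled_fields DESC, id DESC;
```

With this data, row 2 is the only one with both optional fields filled in, so it is the row a "keep the most complete record" rule should leave behind.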

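4 answers:

Answer 0 (score: 1)

OOPS - this is the worst answer I have produced so far. Warning: the top part is dangerous, and I have no idea why I did not include any GROUP BY. Please skip to the bottom, which works now: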

DELETE FROM contacts WHERE ID IN (
  SELECT ID FROM (
    SELECT DISTINCT a.ID
    FROM contacts AS a
    JOIN contacts AS b
    ON  a.name1 = b.name1
    AND a.tel = b.tel
    AND a.email = b.email
    ORDER BY a.name1 DESC, a.name2 DESC, a.title DESC
    LIMIT 1,100000
  ) AS tmp
)

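The LIMIT has to be 1,xxxx and not 0,xxxx, so that the first row is kept and not deleted.

Because you cannot delete directly from the same table that the subquery selects from, just wrap the subquery in one more derived table. This has now been tested.

Before deleting, always double-check what is going to be deleted: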

SELECT * FROM contacts WHERE ID IN (
  SELECT ID FROM (
    SELECT DISTINCT a.ID
    ...
    LIMIT 1,100000
  ) AS tmp
)

Apologies for the damage; luckily you ran it on a test db.

=====================================

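Now this is the correct solution:

Let's check what is in the test table:

[screenshot]

Going by the question, only #2, #4 and #5 should be kept. Here is the result:

[screenshot]

We want to delete every record that is not in the list above. Before deleting, we double-check what is about to be deleted:

[screenshot]

We are ready to delete:

[screenshot]

Here is the SQL; make sure to try it on a test db first: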

DELETE FROM contacts WHERE ID NOT IN (
  SELECT * FROM (
    SELECT ID FROM (
      SELECT * FROM contacts ORDER BY title DESC, name1 DESC, name2 DESC, tel DESC, email DESC
    ) AS tmp
    GROUP BY name1, tel, email
  ) AS del
)
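
A caveat on the query above: it selects ID without aggregating it, which MySQL rejects when ONLY_FULL_GROUP_BY is enabled (the default since 5.7.5), and when that mode is off MySQL does not promise which row of each group supplies the value, whatever the inner ORDER BY says. As a hedged alternative, here is a sketch that makes the "most complete row wins" rule explicit with a window function. It assumes MySQL 8.0 or later and a transactional engine such as InnoDB; the scoring expression and the alias `rn` are my own choices, not part of the question.

```sql
-- Dry run inside a transaction so the result can be inspected before it becomes permanent.
START TRANSACTION;

DELETE c
FROM contacts AS c
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY name1, tel, email
               -- rows with more of the optional fields filled in come first;
               -- ties are broken in favour of the higher id
               ORDER BY (COALESCE(title, '') <> '') + (COALESCE(name2, '') <> '') DESC,
                        id DESC
           ) AS rn
    FROM contacts
) AS ranked ON ranked.id = c.id
WHERE ranked.rn > 1;            -- everything except the best row of each group

SELECT * FROM contacts ORDER BY id;   -- inspect what is left

ROLLBACK;   -- switch to COMMIT once the remaining rows look right
```

On the three sample rows from the question this should rank id 2 first and delete ids 1 and 3. The tie-break on `id DESC` keeps the original behaviour (latest record wins) when two rows are equally complete.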

Answer 1 (score: 1)

You can try GROUP_CONCAT with an ORDER BY:

DELETE c1 FROM contacts c1
JOIN (
    SELECT 
        substring_index(group_concat(id ORDER BY ((title IS NULL OR title ='') AND (name2 IS NULL OR name2 = '')), id DESC), ',', 1) AS id,
        name1, tel, email
    FROM contacts
    GROUP BY name1, tel, email
) c2
ON c1.name1 = c2.name1 AND c1.tel = c2.tel AND c1.email = c2.email AND c1.id <> c2.id;
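
To see what the subquery picks before deleting anything, the same ordering can be run as a plain SELECT. This is only a preview sketch; the aliases `ids_best_first` and `keep_id` are made up for illustration.

```sql
-- The ORDER BY expression is 0 for rows that have a title or a name2 and 1 for rows
-- where both are empty, so filled-in rows sort first; id DESC breaks ties.
SELECT name1, tel, email,
       GROUP_CONCAT(
           id
           ORDER BY ((title IS NULL OR title = '') AND (name2 IS NULL OR name2 = '')),
                    id DESC
       ) AS ids_best_first,
       SUBSTRING_INDEX(
           GROUP_CONCAT(
               id
               ORDER BY ((title IS NULL OR title = '') AND (name2 IS NULL OR name2 = '')),
                        id DESC
           ),
           ',', 1
       ) AS keep_id
FROM contacts
GROUP BY name1, tel, email;
```

For the sample rows in the question this should list the group as 2,3,1 and report 2 as the id to keep. One thing to watch in the DELETE itself: the join compares name1, tel and email with `=`, so a group in which one of those columns is NULL never matches and is left alone; MySQL's null-safe `<=>` operator is one way around that if such rows exist.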


Answer 2 (score: 1)

I got a solution using a NOT EXISTS clause instead of NOT IN:

DELETE FROM contacts
WHERE NOT EXISTS (
    SELECT 1 FROM (
        SELECT * FROM (
            SELECT * FROM contacts AS tmp
            ORDER BY title DESC, name1 DESC, name2 DESC, email DESC, tel DESC
        ) AS tbl
        GROUP BY name1
    ) AS test
    WHERE contacts.id = test.id
)

Answer 3 (score: 0)

This query works without any ordering at all!

DELETE FROM contacts WHERE ID NOT IN (
    SELECT ID FROM (
        SELECT A.ID
        FROM contacts AS A
        JOIN contacts AS B
            ON  A.name1 = B.name1
            AND A.name2 = B.name2
            AND A.tel = B.tel
            AND A.email = B.email
    ) AS mytry
);