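I have a database with a few thousand contacts and want to delete all duplicate records. My current SQL query works well (a record counts as a duplicate when tel, email and name1 all match). The query deletes the duplicates with the lower IDs and keeps the last record that was entered. In some cases, though, other fields of a record have been filled in (the important ones are title and name2). What I want is for MySQL to check whether those fields are filled in and keep only the record that holds the most information.
My query: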
<?php
$del_duplicate_contacts = $mysqli->query("
DELETE ca
FROM contacts ca
LEFT JOIN
(
SELECT MAX(id) id, name1, tel, email
FROM contacts
GROUP BY name1, tel, email
) cb ON ca.id = cb.id AND
ca.name1 = cb.name1 AND
ca.tel = cb.tel AND
ca.email = cb.email
WHERE cb.id IS NULL
");
?>
Example table:
ID  title  name1  name2  tel    email
1          John          01234  1@1.com
2   Mr     John   Smith  01234  1@1.com
3          John          01234  1@1.com
My query will delete records 1 and 2. I want to keep only no. 2 and delete 1 and 3. How can I do this? Is it possible? Or should I involve PHP, and if so, how?
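To make the problem concrete, here is a small reproduction. It is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL, the empty cells are assumed to be empty strings, and the multi-table DELETE is rewritten as an equivalent NOT IN because SQLite has no DELETE ... JOIN.

```python
import sqlite3

# In-memory copy of the example table (SQLite here, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, title TEXT, "
    "name1 TEXT, name2 TEXT, tel TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "", "John", "", "01234", "1@1.com"),
        (2, "Mr", "John", "Smith", "01234", "1@1.com"),
        (3, "", "John", "", "01234", "1@1.com"),
    ],
)

# Same rule as the query above: keep only the highest id in each
# (name1, tel, email) group, whatever the other columns contain.
conn.execute(
    "DELETE FROM contacts WHERE id NOT IN ("
    "  SELECT MAX(id) FROM contacts GROUP BY name1, tel, email)"
)

remaining = [row[0] for row in conn.execute("SELECT id FROM contacts ORDER BY id")]
print(remaining)  # [3]
```

Only record 3 survives, so the more complete record 2 is lost.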
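Answer 0 (score: 1)
OOPS - this is the worst answer I have produced so far. Warning: the top part is dangerous, and I don't know why I left out any GROUP BY. Please skip to the bottom, which now works: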
DELETE FROM contacts WHERE ID IN (
SELECT ID FROM (
SELECT DISTINCT a.ID
FROM contacts AS a
JOIN contacts AS b
ON a.name1 = b.name1
AND a.tel = b.tel
AND a.email = b.email
ORDER BY a.name1 DESC, a.name2 DESC, a.title DESC
LIMIT 1,100000
) AS tmp
)
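The LIMIT must be 1,xxxx, not 0,xxxx, so that the first row is kept and not deleted.
Because you cannot delete directly from the same table that the subquery selects from, just add a wrapping layer (a derived table). This has now been tested.
Before deleting, always double-check what is going to be deleted: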
SELECT * FROM contacts WHERE ID IN (
SELECT ID FROM (
SELECT DISTINCT a.ID
...
LIMIT 1,100000
) AS tmp
)
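The "preview first" habit can also be wrapped in a small helper. This is a rough sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL. The names preview_then_delete and approve are hypothetical, the sample rows come from the question, and the condition reuses the MAX(id) rule from the question rather than the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, title TEXT, "
    "name1 TEXT, name2 TEXT, tel TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "", "John", "", "01234", "1@1.com"),
        (2, "Mr", "John", "Smith", "01234", "1@1.com"),
        (3, "", "John", "", "01234", "1@1.com"),
    ],
)
conn.commit()


def preview_then_delete(conn, condition, approve):
    """List the ids a DELETE ... WHERE <condition> would hit; delete only if approved.

    `condition` is trusted SQL written by the developer, never user input.
    """
    doomed = [
        row[0]
        for row in conn.execute(
            f"SELECT id FROM contacts WHERE {condition} ORDER BY id"
        )
    ]
    if approve(doomed):
        conn.execute(f"DELETE FROM contacts WHERE {condition}")
        conn.commit()
    return doomed


# The rule from the question: everything except the highest id per group.
condition = "id NOT IN (SELECT MAX(id) FROM contacts GROUP BY name1, tel, email)"

# Refuse to delete if the most complete record (id 2) is on the list.
doomed = preview_then_delete(conn, condition, approve=lambda ids: 2 not in ids)
count = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
print(doomed, count)  # [1, 2] 3
```

The preview shows that record 2 would be deleted, so the helper leaves the table untouched.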
Apologies for the damage. Luckily you ran it on a test DB.
=====================================
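Now here is the correct solution:
Let's check what is in the test table:
According to the question, only #2, #4 and #5 should be kept. Here is the result:
We want to delete every record that is not in the list above. Before deleting, we double-check what is going to be deleted:
We are ready to delete:
Here is the SQL. Make sure you try it on a test DB first: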
DELETE FROM contacts WHERE ID NOT IN (
SELECT * FROM (
SELECT ID FROM (
SELECT * FROM contacts ORDER BY title DESC, name1 DESC, name2 DESC, tel DESC, email DESC
) AS tmp
GROUP BY name1, tel, email
) AS del
)
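One caveat about the query above: it depends on MySQL picking the first row of the ordered subquery for each group. MySQL does not guarantee that, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) selecting a non-aggregated ID this way is rejected. Below is a sketch of a deterministic alternative that ranks the rows of each group with ROW_NUMBER(). It is not the method used above. It assumes window-function support (MySQL 8.0+ or SQLite 3.25+) and runs on SQLite through Python's sqlite3 module. The sample rows and the "count of filled fields" score are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, title TEXT, "
    "name1 TEXT, name2 TEXT, tel TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "", "John", "", "01234", "1@1.com"),
        (2, "Mr", "John", "Smith", "01234", "1@1.com"),
        (3, "", "John", "", "01234", "1@1.com"),
        (4, "", "Jane", "", "05678", "2@2.com"),   # no duplicate at all
        (5, "Dr", "Bob", "", "09999", "3@3.com"),  # equally complete pair
        (6, "Dr", "Bob", "", "09999", "3@3.com"),
    ],
)

# Rank each (name1, tel, email) group: most filled-in fields first,
# highest id as the tie-breaker. Every row ranked below 1 is deleted.
DEDUPE_SQL = """
DELETE FROM contacts WHERE id IN (
    SELECT id FROM (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY name1, tel, email
                   ORDER BY (COALESCE(title, '') <> '')
                          + (COALESCE(name2, '') <> '') DESC,
                            id DESC
               ) AS rn
        FROM contacts
    ) AS ranked
    WHERE rn > 1
)
"""
conn.execute(DEDUPE_SQL)

remaining = [row[0] for row in conn.execute("SELECT id FROM contacts ORDER BY id")]
print(remaining)  # [2, 4, 6]
```

Because the ordering is part of the window definition, the result does not depend on how the server happens to pick a row for each group.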
Answer 1 (score: 1)
Using group_concat with an ORDER BY, you could try:
DELETE c1 FROM contacts c1
JOIN (
SELECT
substring_index(group_concat(id ORDER BY ((title IS NULL OR title ='') AND (name2 IS NULL OR name2 = '')), id DESC), ',', 1) AS id,
name1, tel, email
FROM contacts
GROUP BY name1, tel, email
) c2
ON c1.name1 = c2.name1 AND c1.tel = c2.tel AND c1.email = c2.email AND c1.id <> c2.id;
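To see which id that ORDER BY puts first in each group, the same ordering can be modelled in plain Python. This is only an illustration of the sort key, not a replacement for the query. The helpers is_blank and ids_to_keep and the sample rows are hypothetical.

```python
from collections import defaultdict


def is_blank(value):
    """True for NULL (None) or the empty string, like `x IS NULL OR x = ''`."""
    return value is None or value == ""


def ids_to_keep(rows):
    """Return the id that would come first in group_concat for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["name1"], row["tel"], row["email"])].append(row)

    keep = set()
    for members in groups.values():
        # First key: False (0) when title or name2 holds something, True (1)
        # when both are blank. Second key: higher id first (id DESC).
        best = min(
            members,
            key=lambda r: (is_blank(r["title"]) and is_blank(r["name2"]), -r["id"]),
        )
        keep.add(best["id"])
    return keep


COLUMNS = ("id", "title", "name1", "name2", "tel", "email")
rows = [
    dict(zip(COLUMNS, values))
    for values in [
        (1, "", "John", "", "01234", "1@1.com"),
        (2, "Mr", "John", "Smith", "01234", "1@1.com"),
        (3, None, "John", None, "01234", "1@1.com"),
        (4, "Ms", "Ann", "Lee", "05678", "2@2.com"),
        (5, "Ms", "Ann", "", "05678", "2@2.com"),
    ]
]

kept = ids_to_keep(rows)
print(sorted(kept))  # [2, 5]
```

For the John group this keeps record 2, as the question asks. In the Ann group, record 5 wins over the fuller record 4, because the flag only says whether both fields are blank and the tie is broken by id. If that matters, a count of filled fields could be used as the first sort key instead.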
Answer 2 (score: 1)
I got a solution using a NOT EXISTS clause instead of NOT IN:
DELETE FROM contacts
WHERE NOT EXISTS (
SELECT 1 FROM (
SELECT * FROM (
SELECT * FROM contacts AS tmp ORDER BY title DESC, name1 DESC, name2 DESC, email DESC, tel DESC )
-- group on the same columns that define a duplicate in the question
as tbl group by name1, tel, email)
as test WHERE contacts.id = test.id
)
Answer 3 (score: 0)
This query works without any ordering options!
DELETE FROM contacts where ID NOT IN (
SELECT ID FROM ( Select A.ID from contacts as A
join contacts AS B
ON A.name1 = B.name1
AND A.name2 = B.name2
AND A.tel = B.tel
AND A.email = B.email) As mytry);
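It may be worth checking how the self-join treats blank fields before relying on this. The sketch below runs the same query on SQLite through Python's sqlite3 module as a stand-in for MySQL. The helper surviving_ids and the extra Jane row are hypothetical. It compares blanks stored as empty strings with blanks stored as NULL.

```python
import sqlite3

QUERY = """
DELETE FROM contacts WHERE id NOT IN (
    SELECT id FROM (
        SELECT a.id FROM contacts AS a
        JOIN contacts AS b
          ON a.name1 = b.name1 AND a.name2 = b.name2
         AND a.tel = b.tel AND a.email = b.email
    ) AS mytry
)
"""


def surviving_ids(blank):
    """Run the query on a fresh table whose empty cells hold `blank`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, title TEXT, "
        "name1 TEXT, name2 TEXT, tel TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, blank, "John", blank, "01234", "1@1.com"),
            (2, "Mr", "John", "Smith", "01234", "1@1.com"),
            (3, blank, "John", blank, "01234", "1@1.com"),
            (4, blank, "Jane", blank, "05678", "2@2.com"),  # not a duplicate
        ],
    )
    conn.execute(QUERY)
    return [row[0] for row in conn.execute("SELECT id FROM contacts ORDER BY id")]


print(surviving_ids(""))    # [1, 2, 3, 4]
print(surviving_ids(None))  # [2]
```

With empty strings every row joins to itself, so nothing is deleted. With NULLs the comparison on name2 never matches, so every row with a NULL name2 is deleted, including the Jane row that has no duplicate.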