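I have a table with a unique id and an email field. Emails get duplicated. For each duplicated email address I want to keep only one row, the one with the latest id (the last inserted record).
How can I achieve this?
Answer 0 (score: 75)
Imagine your table test contains the following data: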
select id, email
from test;
ID EMAIL
---------------------- --------------------
1 aaa
2 bbb
3 ccc
4 bbb
5 ddd
6 eee
7 aaa
8 aaa
9 eee
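To follow along, the sample table can be recreated with something like the script below. This is only a sketch: the names test, id and email come from the question, while the column types and the AUTO_INCREMENT primary key are assumptions.

```sql
-- Hypothetical schema: the column types are assumed; only the table
-- and column names come from the question.
CREATE TABLE test (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL
);

-- The nine sample rows shown above.
INSERT INTO test (id, email) VALUES
    (1, 'aaa'), (2, 'bbb'), (3, 'ccc'),
    (4, 'bbb'), (5, 'ddd'), (6, 'eee'),
    (7, 'aaa'), (8, 'aaa'), (9, 'eee');
```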
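So we need to find all duplicated emails and delete every row for them except the one with the latest id.
In this case aaa, bbb and eee are duplicated, so we want to delete ids 1, 7, 2 and 6.
To do that, we first need to find all the duplicated emails: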
select email
from test
group by email
having count(*) > 1;
EMAIL
--------------------
aaa
bbb
eee
Then, from this data set, we need to find the latest id for each duplicated email:
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email;
LASTID EMAIL
---------------------- --------------------
8 aaa
4 bbb
9 eee
Finally, we can delete all rows for these emails whose id is smaller than LASTID. So the solution is:
delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;
I don't have MySQL installed on this machine right now, but this should work.
The delete above works, but I found a more optimized version:
delete test
from test
inner join (
select max(id) as lastId, email
from test
group by email
having count(*) > 1) duplic on duplic.email = test.email
where test.id < duplic.lastId;
You can see that it deleted the oldest duplicates, i.e. ids 1, 7, 2 and 6:
select * from test;
+----+-------+
| id | email |
+----+-------+
| 3 | ccc |
| 4 | bbb |
| 5 | ddd |
| 8 | aaa |
| 9 | eee |
+----+-------+
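On real data it can be worth checking which rows would be removed before running either delete. One way, sketched here, is to run the same join as a plain select. On the original nine-row sample it should list ids 1, 2, 6 and 7.

```sql
-- Preview: lists the rows the optimized delete would remove,
-- without changing the table.
select test.id, test.email
from test
inner join (
    select max(id) as lastId, email
    from test
    group by email
    having count(*) > 1
) duplic on duplic.email = test.email
where test.id < duplic.lastId
order by test.id;
```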
Another version is the delete provided by Rene Limon:
delete from test
where id not in (
select max(id)
from test
group by email)
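One caveat with this form: MySQL typically rejects a DELETE whose subquery reads from the table being deleted from (error 1093, "You can't specify target table ... for update in FROM clause"). A common workaround, sketched below, wraps the subquery in a derived table so that it is materialized first. The alias keep_ids is a made-up name.

```sql
-- Same filter, with the subquery wrapped in a derived table.
-- keep_ids is a hypothetical alias; MySQL requires some alias here.
delete from test
where id not in (
    select lastId from (
        select max(id) as lastId
        from test
        group by email
    ) as keep_ids
);
```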
Answer 1 (score: 9)
The correct way is:
DELETE FROM `tablename`
WHERE id NOT IN (
SELECT * FROM (
SELECT MAX(id) FROM tablename
GROUP BY name
) AS latest_ids -- MySQL requires an alias for every derived table
)
Answer 2 (score: 4)
Try this approach:
DELETE t1 FROM test t1, test t2
WHERE t1.id < t2.id AND t1.email = t2.email -- deletes rows that have a newer duplicate, so the highest id is kept
Answer 3 (score: 3)
DELETE
FROM
`tbl_job_title`
WHERE id NOT IN
(SELECT
*
FROM
(SELECT
MAX(id)
FROM
`tbl_job_title`
GROUP BY NAME) tbl)
Revised and working version! Thanks @Gaurav
Answer 4 (score: 1)
If you want to keep the row with the lowest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id > n2.id AND n1.email = n2.email
If you want to keep the row with the highest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id < n2.id AND n1.email = n2.email
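Answer 5 (score: 0)
I must say the optimized version is a sweet, elegant piece of code, and it works like a charm even when the comparison is done on a DATETIME column. This is what I used in my script, where I look up the latest contract end date for each EmployeeID.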
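Thanks a lot!
Answer 6 (score: 0)
Personally, I had problems with the two top-voted answers. This is not the cleanest solution, but you can use a temporary table to avoid all the problems MySQL has with deleting through a join on the same table.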
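A minimal sketch of that temporary-table idea, assuming the test table with id and email columns from Answer 0. The helper table name keep_ids is hypothetical, and this is not necessarily the exact code this answer had in mind.

```sql
-- Step 1: collect the id to keep for each email.
-- keep_ids is a hypothetical helper table name.
CREATE TEMPORARY TABLE keep_ids AS
    SELECT MAX(id) AS id
    FROM test
    GROUP BY email;

-- Step 2: delete every row whose id was not collected above.
-- The DELETE no longer reads from test in a subquery, so the
-- same-table restriction does not apply.
DELETE FROM test
WHERE id NOT IN (SELECT id FROM keep_ids);

-- Step 3: clean up.
DROP TEMPORARY TABLE keep_ids;
```

The temporary table only exists for the current session, so concurrent sessions do not see or clash with it.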
Answer 7 (score: 0)
Here is a nice stored procedure I created that deletes all duplicate records from a table without needing an existing unique id on that table.
DELIMITER //
CREATE FUNCTION findColumnNames(tableName VARCHAR(255))
RETURNS TEXT
BEGIN
SET @colNames = "";
SELECT GROUP_CONCAT(COLUMN_NAME) FROM INFORMATION_SCHEMA.columns
WHERE TABLE_NAME = tableName AND TABLE_SCHEMA = DATABASE() -- ignore same-named tables in other schemas
GROUP BY TABLE_NAME INTO @colNames;
RETURN @colNames;
END //
DELIMITER ;
DELIMITER //
CREATE PROCEDURE deleteDuplicateRecords (IN tableName VARCHAR(255))
BEGIN
SET @colNames = findColumnNames(tableName);
SET @addIDStmt = CONCAT("ALTER TABLE ",tableName," ADD COLUMN id INT AUTO_INCREMENT KEY;");
SET @deleteDupsStmt = CONCAT("DELETE FROM ",tableName," WHERE id NOT IN
( SELECT * FROM ",
" (SELECT min(id) FROM ",tableName," group by ",findColumnNames(tableName),") AS tmpTable);");
set @dropIDStmt = CONCAT("ALTER TABLE ",tableName," DROP COLUMN id");
PREPARE addIDStmt FROM @addIDStmt;
EXECUTE addIDStmt;
PREPARE deleteDupsStmt FROM @deleteDupsStmt;
EXECUTE deleteDupsStmt;
PREPARE dropIDStmt FROM @dropIDStmt;
EXECUTE dropIDstmt;
END //
DELIMITER ;
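Answer 8 (score: 0)
I wanted to delete duplicate records based on multiple columns of a table, and this approach worked for me.
Step 1 - get the max id of each group of duplicate records (the row to keep):
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) > 1) a
Step 2 - get the ids of the records that have no duplicate (MAX(id) is used so the query also runs under ONLY_FULL_GROUP_BY; each of these groups has exactly one row):
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) = 1) b
Step 3 - delete every row whose id is not returned by either of the two queries above:
DELETE FROM `table_name`
WHERE
id NOT IN (paste step 1 query) -- keeps the newest row of each duplicate group
and
id NOT IN (paste step 2 query) -- keeps the rows that have no duplicate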
Final query:
DELETE FROM `table_name`
WHERE id NOT IN (
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) > 1) a
)
and id not in (
select * FROM ( SELECT MAX(id) FROM table_name
group by travel_intimation_id,approved_by,approval_type,approval_status having
count(*) = 1) b
);
This query deletes only the duplicate records.