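I imported data from a plain-text file into a MySQL database. I have now found duplicate entries that I want to delete. The duplicates are identified by a key that is not the primary key. Note that I have to keep one row from each set of duplicates.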
Table T1, with one value that occurs three times, for example:
ID (unique, primary key)   REAL_ID (char(11))
1                          '01234567890'
2                          '01234567891'
3                          '01234567891'
4                          '01234567891'
...
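To reproduce the example, the table could be set up roughly as follows. This is only a sketch: the INT type, AUTO_INCREMENT, and NOT NULL are assumptions, since the question only states that ID is the primary key and REAL_ID is char(11).

```sql
-- Hypothetical setup for the example table.
-- Only "ID is the primary key" and "REAL_ID is char(11)" come from the question;
-- INT, AUTO_INCREMENT and NOT NULL are assumptions.
CREATE TABLE T1 (
    ID      INT NOT NULL AUTO_INCREMENT,
    REAL_ID CHAR(11) NOT NULL,
    PRIMARY KEY (ID)
);

-- The four sample rows: '01234567891' occurs three times.
INSERT INTO T1 (ID, REAL_ID) VALUES
    (1, '01234567890'),
    (2, '01234567891'),
    (3, '01234567891'),
    (4, '01234567891');
```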
Right now I use
SELECT REAL_ID AS x, COUNT(*) AS y FROM T1 GROUP BY x HAVING y>1;
to identify the duplicates. The result is
+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    1 | 01234567891 |
|    2 | 01234567891 |
|    3 | 01234567891 |
+------+-------------+
I can even build the list of IDs that I have to delete:
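SELECT ID
FROM T1
RIGHT JOIN (SELECT REAL_ID AS x, COUNT(*) AS y  -- group on REAL_ID; ID is unique
            FROM T1
            GROUP BY x
            HAVING y > 1) AS T2 ON (T2.x = T1.REAL_ID)
LIMIT 1, 100;  -- skips the first row of the whole result, not one row per group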
The result is
+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    2 | 01234567890 |
|    3 | 01234567890 |
+------+-------------+
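Now I need help with how to delete these entries.
Since I cannot combine DELETE with a subquery, I tried to flag all duplicate entries in the REAL_ID column and then use
DELETE FROM T1 WHERE REAL_ID='flag';
but I cannot figure out how to flag those entries.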
Answer 0 (score: 0)
You can do this:
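-- Assumes the row with the lowest ID in each group is the one to keep.
DELETE t
FROM T1 t
INNER JOIN
(
    -- One row per duplicated REAL_ID, with the ID that should survive.
    SELECT REAL_ID, MIN(ID) AS keep_id, COUNT(*) AS y
    FROM T1
    GROUP BY REAL_ID
    HAVING y > 1
) AS T2 ON T2.REAL_ID = t.REAL_ID
WHERE t.ID <> T2.keep_id;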
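The same idea can also be written as a self-join. The following is a minimal sketch, assuming the row with the lowest ID in each REAL_ID group is the one to keep (the question does not say which copy should survive). The aliases dup and keeper are hypothetical names chosen for illustration.

```sql
-- Preview: every row that has a same-REAL_ID sibling with a smaller ID.
-- DISTINCT is needed because a row can match several smaller siblings.
SELECT DISTINCT dup.ID, dup.REAL_ID
FROM T1 AS dup
INNER JOIN T1 AS keeper
        ON keeper.REAL_ID = dup.REAL_ID
       AND keeper.ID < dup.ID;

-- Delete exactly those rows; the lowest ID per REAL_ID has no smaller
-- sibling, so it never matches the join and is left in place.
DELETE dup
FROM T1 AS dup
INNER JOIN T1 AS keeper
        ON keeper.REAL_ID = dup.REAL_ID
       AND keeper.ID < dup.ID;
```

With the four sample rows from the question, the preview lists IDs 3 and 4, and the DELETE leaves IDs 1 and 2. On a large table this join is likely to benefit from an index on REAL_ID.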
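Update: Note that you cannot use LIMIT or ORDER BY with the multiple-table DELETE syntax. Quoting the MySQL documentation for DELETE:
"For the multiple-table syntax, DELETE deletes from each tbl_name the rows that satisfy the conditions. In this case, ORDER BY and LIMIT cannot be used."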
Try this instead:
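-- Single-table form. Assumes the lowest ID per REAL_ID is the row to keep.
-- The inner derived table (keepers) is materialized first, which avoids
-- MySQL error 1093 (selecting from the table that is being modified).
DELETE FROM T1
WHERE REAL_ID IS NOT NULL
  AND ID NOT IN
(
    SELECT keep_id
    FROM
    (
        SELECT MIN(ID) AS keep_id
        FROM T1
        WHERE REAL_ID IS NOT NULL
        GROUP BY REAL_ID
    ) AS keepers
);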
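
After running a delete, it is worth checking that no duplicates remain. The sketch below repeats the grouping check from the question and then, optionally, adds a unique index so that a later import cannot reintroduce duplicates. The index name uq_real_id is a hypothetical name, and whether a unique constraint fits your data is an assumption.

```sql
-- Should return no rows once every REAL_ID occurs only once.
SELECT REAL_ID, COUNT(*) AS copies
FROM T1
GROUP BY REAL_ID
HAVING COUNT(*) > 1;

-- Optional: make MySQL reject future duplicates.
-- This statement fails if duplicates are still present.
ALTER TABLE T1 ADD UNIQUE INDEX uq_real_id (REAL_ID);
```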