I have a table that contains many duplicate records:
shop
ID tax_id
1 10
1 10
1 11
2 10
2 12
2 10
2 10
I want to delete all duplicate records without creating a temporary table. After the update query, the table should look like this:
shop
ID tax_id
1 10
1 11
2 10
2 12
Answer 0 (score: 5)
Here is an in-place solution (though not a one-liner).
Find the maximum ID:
select max(id) as maxid
from shop;
Remember this value. Suppose it equals 1000.
Re-insert the unique values with an offset:
insert into shop (id, tax_id)
select distinct id + 1000, tax_id
from shop;
Delete the old values:
delete from shop
where id <= 1000;
Restore the normal IDs:
update shop
set id = id - 1000;
PROFIT!
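The four steps above can be run as a single routine. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the function name dedupe_with_offset is hypothetical, and it assumes every id is positive so that the shifted copies never collide with the originals.

```python
import sqlite3

def dedupe_with_offset(conn):
    """Re-insert distinct rows above the max id, drop the originals, shift back."""
    cur = conn.cursor()
    # Step 1: use the current maximum id as the offset (1000 in the answer).
    (offset,) = cur.execute("SELECT MAX(id) FROM shop").fetchone()
    # Step 2: re-insert each distinct (id, tax_id) pair with a shifted id.
    cur.execute(
        "INSERT INTO shop (id, tax_id) SELECT DISTINCT id + ?, tax_id FROM shop",
        (offset,),
    )
    # Step 3: delete the original rows, which all have id <= offset.
    cur.execute("DELETE FROM shop WHERE id <= ?", (offset,))
    # Step 4: shift the surviving rows back to their original ids.
    cur.execute("UPDATE shop SET id = id - ?", (offset,))
    conn.commit()

# Sample data from the question (assumed column types: integers).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER, tax_id INTEGER)")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 10), (2, 10)],
)
dedupe_with_offset(conn)
rows = sorted(conn.execute("SELECT id, tax_id FROM shop").fetchall())
print(rows)
```

Reading the offset from the table instead of hard-coding 1000 avoids the manual "remember this value" step, but the routine should still run inside one transaction on a table nobody else is writing to.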
Answer 1 (score: 5)
A working solution.
//SQL query to find duplicates
SELECT id, tax_id, count(*) AS cnt
FROM shop
GROUP BY id, tax_id
HAVING cnt > 1
--- res
+------+--------+-----+
| id | tax_id | cnt |
+------+--------+-----+
| 1 | 10 | 2 |
| 2 | 10 | 3 |
+------+--------+-----+
//Iterate through results with your language of choice
DELETE
FROM shop
WHERE id=<res id>
AND tax_id=<res tax_id>
LIMIT <cnt - 1>
---res (iterated)
+------+--------+
| id | tax_id |
+------+--------+
| 1 | 10 |
| 1 | 11 |
| 2 | 12 |
| 2 | 10 |
+------+--------+
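These two queries need a small piece of PHP to perform the deletion:
// Note: the mysql_* extension was removed in PHP 7; use mysqli or PDO there.
$res = mysql_query("SELECT id, tax_id, count(*) AS cnt
FROM shop
GROUP BY id, tax_id
HAVING cnt > 1");
while ($row = mysql_fetch_assoc($res)) {
    // Delete all but one copy of each duplicated (id, tax_id) pair
    mysql_query("DELETE
    FROM shop
    WHERE id=" . (int)$row['id'] . "
    AND tax_id=" . (int)$row['tax_id'] . "
    LIMIT " . ($row['cnt'] - 1));
}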
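For readers who want to try the count-then-delete idea without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It is not the author's code: the function name delete_extra_copies is made up, and because SQLite does not support DELETE ... LIMIT by default, the sketch selects the rows to remove by rowid instead.

```python
import sqlite3

def delete_extra_copies(conn):
    """Delete cnt - 1 copies of every duplicated (id, tax_id) pair."""
    # Find each pair that occurs more than once, together with its count.
    dupes = conn.execute(
        "SELECT id, tax_id, COUNT(*) AS cnt FROM shop "
        "GROUP BY id, tax_id HAVING COUNT(*) > 1 ORDER BY id, tax_id"
    ).fetchall()
    for id_, tax_id, cnt in dupes:
        # SQLite stand-in for MySQL's DELETE ... LIMIT: pick cnt - 1 victims
        # by rowid so that exactly one copy survives.
        conn.execute(
            "DELETE FROM shop WHERE rowid IN ("
            "  SELECT rowid FROM shop WHERE id = ? AND tax_id = ? LIMIT ?)",
            (id_, tax_id, cnt - 1),
        )
    conn.commit()
    return dupes

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER, tax_id INTEGER)")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 10), (2, 10)],
)
dupes = delete_extra_copies(conn)
remaining = sorted(conn.execute("SELECT id, tax_id FROM shop").fetchall())
print(dupes, remaining)
```

Grouping by both columns matters here: grouping by id alone would count rows with different tax_id values together.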
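Edit: I revisited this recently. Here is an alternative solution that uses a temporary column and needs no scripting language.
ALTER TABLE shop ADD COLUMN place INT;
SET @i = 0;
-- Number every row so that duplicates can be told apart
UPDATE shop SET place = (@i := @i + 1);
-- Keep the lowest place per (id, tax_id); the derived table avoids reading
-- the table being deleted from directly
DELETE FROM shop WHERE place NOT IN (
  SELECT place FROM (SELECT MIN(place) AS place FROM shop GROUP BY id, tax_id) AS keepers
);
ALTER TABLE shop DROP COLUMN place;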
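The idea behind the temporary column is to give every row a distinct number and keep one number per group. As a loose illustration only, the sketch below does the same in SQLite through Python's sqlite3 module, where the implicit rowid already plays the role of the place column. This swaps in a SQLite feature and is not something MySQL offers.

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER, tax_id INTEGER)")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 10), (2, 10)],
)

# Keep the lowest row number in each (id, tax_id) group and delete the rest.
# rowid is SQLite's built-in row number, standing in for the place column.
conn.execute(
    "DELETE FROM shop WHERE rowid NOT IN ("
    "  SELECT MIN(rowid) FROM shop GROUP BY id, tax_id)"
)
conn.commit()

remaining = sorted(conn.execute("SELECT id, tax_id FROM shop").fetchall())
print(remaining)
```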
Answer 2 (score: 3)
First, for future reference, you can prevent this from happening by creating a unique index on these two fields.
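To show what such an index buys you, here is a small sketch with Python's built-in sqlite3 module standing in for MySQL. The index name shop_id_tax is an arbitrary choice for the example; in either database the index can only be created once the existing duplicates are gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER, tax_id INTEGER)")
# A unique index over both columns makes each (id, tax_id) pair unique.
conn.execute("CREATE UNIQUE INDEX shop_id_tax ON shop (id, tax_id)")

conn.execute("INSERT INTO shop VALUES (1, 10)")
conn.execute("INSERT INTO shop VALUES (1, 11)")  # same id, new tax_id: allowed

rejected = False
try:
    conn.execute("INSERT INTO shop VALUES (1, 10)")  # exact duplicate
except sqlite3.IntegrityError:
    # The database refuses the second copy instead of storing it.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM shop").fetchone()[0]
print(rejected, row_count)
```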
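As for the solution: create a new table shopnew with the same structure in MySQL, or simply delete every record from the table once the recordList has been built (make sure you have a backup!):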
//Get every record from mysql
$sSQL = "SELECT id, tax_id FROM shop"; // lowercase names so the 'id' key below matches
$oRes = mysql_query($sSQL);
$aRecordList = array();
while($aRow = mysql_fetch_assoc($oRes)){
//If record is a duplicate, it will be 'overwritten'
$aRecordList[$aRow['id'].".".$aRow['tax_id']] =1;
}
//You could delete every record from shop here, if you dont want an additional table
//recordList now only contains unique records
foreach($aRecordList as $sRecord=>$bSet){
$aExpRecord = explode(".",$sRecord);
mysql_query("INSERT INTO shopnew SET id=".$aExpRecord[0].", tax_id=".$aExpRecord[1]);
}
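The same read, deduplicate in memory, write back pattern can be sketched in a few lines. This version is an illustration rather than a port of the PHP above: it uses Python's built-in sqlite3 module, the function name rewrite_unique is hypothetical, and it rewrites shop itself inside one transaction instead of filling shopnew.

```python
import sqlite3

def rewrite_unique(conn):
    """Load all rows, drop duplicates in memory, then rewrite the table."""
    rows = conn.execute("SELECT id, tax_id FROM shop").fetchall()
    # dict.fromkeys keeps the first occurrence of each (id, tax_id) tuple,
    # like the PHP array keyed by "id.tax_id".
    unique_rows = list(dict.fromkeys(rows))
    # The with-block commits on success and rolls back on error, so the
    # table is never left empty if the re-insert fails.
    with conn:
        conn.execute("DELETE FROM shop")
        conn.executemany(
            "INSERT INTO shop (id, tax_id) VALUES (?, ?)", unique_rows
        )
    return unique_rows

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER, tax_id INTEGER)")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 10), (2, 10)],
)
unique_rows = rewrite_unique(conn)
remaining = sorted(conn.execute("SELECT id, tax_id FROM shop").fetchall())
print(remaining)
```

This only suits tables small enough to hold in memory.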
Answer 3 (score: 3)
Maybe this will help:
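$query = "SELECT * FROM shop ORDER BY id";
$rez = $dbh->query($query);
$multi = $rez->fetchAll(PDO::FETCH_ASSOC);
$deleted = array(); // indexes of rows already removed as duplicates
foreach ($multi as $key => $row) {
    if (isset($deleted[$key])) continue; // this copy is already gone
    // preserve_keys = true so that $k still indexes into $multi
    $rest = array_slice($multi, $key + 1, null, true);
    foreach ($rest as $k => $other) {
        if (!isset($deleted[$k]) && $row['id'] == $other['id'] && $row['tax_id'] == $other['tax_id']) {
            // LIMIT 1 removes a single copy; without it every matching row,
            // including the one we want to keep, would be deleted
            $dbh->query("DELETE FROM shop WHERE id={$other['id']} AND tax_id={$other['tax_id']} LIMIT 1");
            $deleted[$k] = true;
        }
    }
}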
The first foreach iterates over every row, and the second one does the comparison.
I am using PDO, but of course you can do it the procedural way.
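Answer 4 (score: 2)
Actually, under the current constraints the problem is a rather tricky challenge. I spent a whole evening thinking about a solution (knowing that the solution would never be useful). I would not use it in the wild; I was only trying to find out whether it can be done with MySQL alone.
The question in my formulation: is it possible to write a series of DELETE statements that remove duplicate rows from a two-column table that has no unique constraint?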
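Problems: the {{1}} form of DELETE does not support ORDER BY, and DELETE can only have a WHERE clause, with no support for HAVING. That is, ordering is applied only after the conditions are met. Suppose we have a table:
CREATE TABLE `tablename` (
`a_id` int(10) unsigned NOT NULL,
`b_id` int(10) unsigned NOT NULL,
KEY `Index_1` (`a_id`,`b_id`)
) ENGINE=InnoDB COLLATE utf8_bin;
I added a key (neither UNIQUE nor PRIMARY) to make lookups faster, hoping to use it for grouping.
You can feed the table with some values:
INSERT INTO tablename (a_id, b_id) VALUES (2, 3), (1, 1), (2, 2), (1,4);
INSERT INTO tablename (a_id, b_id) VALUES (2, 3), (1, 1), (2, 2), (1,4);
INSERT INTO tablename (a_id, b_id) VALUES (2, 3), (1, 1), (2, 2), (1,4);
As a side effect, the key becomes a covering index: when we SELECT from the table the displayed values are sorted, but when we delete, the values are read in the order in which we inserted them.
Now, let's look at the following query:
SELECT @c, @a_id as a, @b_id as b, a_id, b_id
FROM tablename, (SELECT @a_id:=0, @b_id:=0, @c:=0) as init
WHERE (@c:=IF(LEAST(@a_id=(@a_id:=a_id), @b_id=(@b_id:=b_id)), @c+1, 1)) >= 1
;
The result:
@c, a, b, a_id, b_id
1, 1, 1, 1, 1
2, 1, 1, 1, 1
3, 1, 1, 1, 1
1, 1, 4, 1, 4
2, 1, 4, 1, 4
3, 1, 4, 1, 4
1, 2, 2, 2, 2
2, 2, 2, 2, 2
3, 2, 2, 2, 2
1, 2, 3, 2, 3
2, 2, 3, 2, 3
3, 2, 3, 2, 3
The result is automatically sorted using Index_1, and the duplicate pairs (a_id, b_id) are numbered in the @c column. So our task now is to delete all rows where @c > 1. The only remaining problem is to force MySQL to use Index_1 when deleting, which is rather tricky without applying additional conditions. But we can achieve it with an equality check, or several equality checks, on a_id:
DELETE FROM t
USING tablename t FORCE INDEX (Index_1)
JOIN (SELECT @a_id:=0, @b_id:=0, @c:=0) as init
WHERE a_id IN (1)
AND (@c:=IF(LEAST(@a_id=(@a_id:=a_id), @b_id=(@b_id:=b_id)), @c+1, 1)) > 1;
DELETE FROM t
USING tablename t FORCE INDEX (Index_1)
JOIN (SELECT @a_id:=0, @b_id:=0, @c:=0) as init
WHERE a_id IN (2)
AND (@c:=IF(LEAST(@a_id=(@a_id:=a_id), @b_id=(@b_id:=b_id)), @c+1, 1)) > 1;
SELECT * FROM tablename t;
a_id, b_id
1, 1
1, 4
2, 2
2, 3
I could not put all possible a_id values into IN(), because MySQL would then decide that the index is useless in this case and the query would not delete all duplicates (only adjacent ones). But with, say, 10 different a_id values, I could delete the duplicates in two DELETE statements, each IN having 5 explicit ids.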
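The @c bookkeeping is easier to follow outside SQL. The sketch below replays it in plain Python: sorted() stands in for reading the rows in Index_1 order, and the function name number_duplicates is invented for the example. It illustrates only the counting logic, not how MySQL evaluates user variables.

```python
def number_duplicates(rows):
    """Number each row within its run of identical (a_id, b_id) pairs."""
    numbered = []
    prev, c = None, 0
    for pair in sorted(rows):  # sorted() stands in for the index-ordered read
        # Same pair as the previous row: bump the counter; otherwise restart
        # at 1, mirroring the IF(..., @c+1, 1) expression.
        c = c + 1 if pair == prev else 1
        prev = pair
        numbered.append((c, pair))
    return numbered

# The three identical INSERT statements from the answer.
rows = [(2, 3), (1, 1), (2, 2), (1, 4)] * 3
numbered = number_duplicates(rows)

# Rows whose counter is above 1 are the ones the DELETE statements remove.
survivors = [pair for c, pair in numbered if c == 1]
print(survivors)
```

The counter only detects duplicates that are adjacent in read order, which is why the answer insists on forcing the index during the delete.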
Hope this may be useful to someone =)