Question

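Table table1 contains 1,500,000 rows and 80 fields. I want to remove duplicates based on field1 and field2. The ID field is unique, so I used MAX to pick which row to keep.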

Option 1: INSERT statement

insert into table2_unique 
select * from table1 a
where  a.id = ( select max(b.id) from table1 b
               where a.field1 = b.field1
               and a.field2 = b.field2 );

But the query fails with the following error.

Error Code: 1206. The total number of locks exceeds the lock table size

EXPLAIN output:

id  select_type         table   partitions  type  possible_keys    key      key_len  ref       rows     filtered  Extra
1   INSERT              table2  NULL        ALL   NULL             NULL     NULL     NULL      NULL     NULL      NULL
1   PRIMARY             a       NULL        ALL   NULL             NULL     NULL     NULL      1387764  100       Using where
2   DEPENDENT SUBQUERY  b       NULL        ref   field1x,field2x  field1x  39       a.field1  537      10        Using where

Option 2: DELETE statement

DELETE n1 FROM table1 n1, table1 n2 WHERE n1.id > n2.id AND n1.field1 = n2.field1 AND n1.field2 = n2.field2

When I run it, a deadlock occurs.

I cannot increase the buffer pool size. Please let me know whether I should write the query differently.

Answer 1

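I'm not sure how it affects locking, but in my experience dependent subqueries (i.e. pushed predicates) have never performed well in MySQL. I would write the first query as: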

insert into table2_unique (id, col1, col2, ...col79)
select a.id, a.col1, a.col2, ...a.col79
from table1 a
inner join (
   -- one id per (field1, field2) group: the row to keep
   select max(b.id) as id
   from table1 b
   group by b.field1, b.field2
 ) as dedup
 on a.id = dedup.id;
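
After the insert, a quick check along these lines could confirm that no duplicates remain in the target table. This is only a sketch, assuming table2_unique has the same field1 and field2 columns as table1:

```sql
-- Any row returned here is a (field1, field2) pair that still has duplicates.
SELECT field1, field2, COUNT(*) AS copies
FROM table2_unique
GROUP BY field1, field2
HAVING COUNT(*) > 1;
```

An empty result set would mean each (field1, field2) pair appears exactly once.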

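Trying to modify a table through a join is always a bit dodgy. When it is a self-join, a deadlock is no surprise. Using a temporary table and splitting the operation into two steps avoids this.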
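
A minimal sketch of that two-step approach, using the table and column names from the question. The temporary table name keep_ids and the BIGINT type for id are assumptions, and whether this stays under the lock limit with a small buffer pool has not been tested here:

```sql
-- Step 1: collect the id to keep for each (field1, field2) pair.
-- keep_ids is a hypothetical name; BIGINT assumes id is an integer column.
CREATE TEMPORARY TABLE keep_ids (
    id BIGINT NOT NULL PRIMARY KEY
);

INSERT INTO keep_ids (id)
SELECT MAX(id)
FROM table1
GROUP BY field1, field2;

-- Step 2: delete every row whose id was not kept.
-- table1 is joined to the temporary table, not to itself.
DELETE t
FROM table1 t
LEFT JOIN keep_ids k ON k.id = t.id
WHERE k.id IS NULL;
```

If the DELETE still fails with error 1206, running it several times with an added range condition on t.id would limit how many row locks each statement holds.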

Answer 2

I increased innodb_buffer_pool_size in the my.ini file; the query then ran in 27 minutes on the data volume described.
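
For reference, the setting can be inspected from a client session and, assuming MySQL 5.7.5 or later where the variable is dynamic, changed at runtime. The 1 GB value below is purely illustrative and is not the value used in this answer:

```sql
-- Show the current buffer pool size in bytes.
SELECT @@innodb_buffer_pool_size;

-- Resize at runtime (MySQL 5.7.5+; needs a privileged account).
-- 1073741824 bytes = 1 GB, an illustrative value only.
SET GLOBAL innodb_buffer_pool_size = 1073741824;
```

A value set this way is lost on restart unless it is also written to my.ini.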
