Question

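I have an InnoDB table with a primary key and a secondary unique key. I used LOAD DATA INFILE to load a very large CSV file (200 million records) into it, and afterwards I found duplicate records in the table. This makes no sense to me. How can this happen? I checked that unique_checks is ON at both the session and global level.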
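
For reference, both scopes of that setting can be read in one statement. This is only a sketch of one way to do the check described above; it relies on nothing beyond the standard `unique_checks` system variable, and the column aliases are illustrative.

```sql
-- Show the session and global values of unique_checks side by side (1 means ON).
SELECT @@SESSION.unique_checks AS session_unique_checks,
       @@GLOBAL.unique_checks  AS global_unique_checks;
```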

The LOAD DATA INFILE query I used:

load data infile "/tmp/test" replace into table temp.test fields terminated by ';' lines terminated by '\t\n' (first_name, last_name, birth_date, doc_number);

The table schema is:

create table test(
  id int(10) not null auto_increment,
  first_name varchar(30) not null default '',
  last_name varchar(30) not null default '',
  birth_date datetime null default null,
  doc_number int(10) not null default 0,
  primary key (id, first_name),
  unique key (first_name, last_name, birth_date, doc_number)
)
partition by range(id) (
  PARTITION p0 VALUES LESS THAN (100000000) ENGINE = InnoDB,
  PARTITION p1 VALUES LESS THAN (400000000) ENGINE = InnoDB,
  PARTITION p2 VALUES LESS THAN (700000000) ENGINE = InnoDB,
  PARTITION p3 VALUES LESS THAN (maxvalue) ENGINE = InnoDB
);
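
To compare the definition above with what the server actually holds, the live table can be inspected directly. The statements below are a minimal sketch that assumes the table is `temp.test`, as in the LOAD DATA statement; both are standard MySQL statements.

```sql
-- Print the table definition as the server stores it, including partitions.
SHOW CREATE TABLE temp.test;

-- List the indexes on the table; Non_unique = 0 marks a unique index.
SHOW INDEX FROM temp.test;
```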

The duplicate records I found:

select * from temp.test where first_name = 'John' and last_name = 'Doe';
+----+------------+-----------+---------------------+------------+
| id | first_name | last_name | birth_date          | doc_number |
+----+------------+-----------+---------------------+------------+
|  3 | John       | Doe       | 1967-05-04 00:00:00 |       1843 |
| 97 | John       | Doe       | 1967-05-04 00:00:00 |       1843 |
+----+------------+-----------+---------------------+------------+
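
A broader version of the same check might look like the sketch below. The first query lists every combination of the four unique-key columns that occurs more than once. The second prints the stored bytes of the name columns for the two rows shown above, so that invisible differences, if there are any, become visible. The table name `temp.test` and the ids 3 and 97 come from this post; the aliases are illustrative.

```sql
-- Every combination of the unique-key columns that appears more than once.
SELECT first_name, last_name, birth_date, doc_number, COUNT(*) AS copies
FROM temp.test
GROUP BY first_name, last_name, birth_date, doc_number
HAVING COUNT(*) > 1;

-- Exact stored bytes of the name columns for the two rows shown above.
SELECT id,
       HEX(first_name) AS first_name_hex,
       HEX(last_name)  AS last_name_hex,
       birth_date,
       doc_number
FROM temp.test
WHERE id IN (3, 97);
```

On a table of this size the first query can take a long time, so it may be worth narrowing it with a WHERE clause first.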

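I tried to reproduce the problem with a smaller data set, but could not. I am now trying to reproduce it on the original data set. It still makes no sense to me, because the table has a unique key. Any suggestions or pointers on where to look next would be very helpful. Thanks!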

Duplicate records in MySQL with a unique key enabled

0 answers: