Question

create table join_table (
  id1 integer,
  id2 integer
);

I want to create a UNIQUE constraint on (id1, id2), but I am seeing some bad data, for example:

id1   | id2
------------
1     | 1
1     | 1
1     | 2

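So the record (1,1) is clearly duplicated and would violate the unique constraint. How can I write a SQL query that deletes the duplicate records from the table?

Note: I want to delete just one of the duplicates so that I can then create the unique constraint.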

Answer 1

This will keep one of the duplicates:

delete from join_table
where ctid not in (select min(ctid)
                   from join_table
                   group by id1, id2);

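Your table has no unique identifier that could be used to "pick one survivor". That is where Postgres' ctid comes in handy, because it is an internal unique identifier for each row. Note that ctid should not be relied on beyond a single statement. It is not a universally unique value, but for the runtime of a single statement it is fine.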
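Since the goal in the question is to add the unique constraint afterwards, here is a minimal sketch of doing both steps together. The delete is the statement from above; the constraint name join_table_id1_id2_key is only an illustrative choice, not something from the question.

```sql
-- Sketch: remove the duplicates and add the constraint in one transaction,
-- so no new duplicates can be inserted between the two steps.
begin;

-- Keep one row per (id1, id2) pair; same statement as above.
delete from join_table
where ctid not in (select min(ctid)
                   from join_table
                   group by id1, id2);

-- Constraint name is hypothetical; pick whatever fits your naming scheme.
alter table join_table
  add constraint join_table_id1_id2_key unique (id1, id2);

commit;
```

If the alter table step fails because duplicates are still present, the whole transaction is rolled back and the table is left unchanged.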

SQLFiddle example: http://sqlfiddle.com/#!15/dabfc/1

If you instead want to delete every row that has a duplicate (keeping none of the copies):

delete from join_table
where (id1, id2) in (select id1, id2
                     from join_table
                     group by id1, id2
                     having count(*) > 1);

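Neither of these solutions will be fast on a large table. If the large table has a substantial number of rows, creating a new table without the duplicates, as jjanes shows, will be much faster.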

Answer 2

Without a primary key, that will be hard to do.

Is the existing table referenced by name in any FK constraints? If not, just recreate it.

begin;
create table new_table as select distinct * from join_table;
drop table join_table;
alter table new_table rename TO join_table;
commit;
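Note that create table ... as select copies only the data, not indexes or constraints, so the unique constraint from the question still has to be added to the rebuilt table. A minimal sketch, assuming the rename above has already been committed; the constraint name is only an illustrative choice:

```sql
-- Sketch: add the unique constraint to the rebuilt, duplicate-free table.
-- The constraint name join_table_id1_id2_key is hypothetical.
alter table join_table
  add constraint join_table_id1_id2_key unique (id1, id2);
```

The same statement could also be placed inside the transaction above, just before the commit.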

How do I delete duplicates in a table?
