Question

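What is the best way to insert data into a table and delete duplicate entries at the same time? One way is to store the identifiers of the duplicate rows in a temporary table and then delete them, but that is not efficient. Any better ideas would be much appreciated.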

My table:

CREATE TABLE account(
  user_id serial PRIMARY KEY,
  username VARCHAR(50) UNIQUE NOT NULL,
  password VARCHAR(50) NOT NULL,
  email VARCHAR(355) UNIQUE NOT NULL,
  created_on TIMESTAMP NOT NULL,
  last_login TIMESTAMP
);

Answer 1

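The "immediate" answer would be to simply run a DELETE statement followed by an INSERT statement in a single transaction.

Assuming you want to avoid duplicate usernames, you could do something like this:

begin transaction;
  delete from account
  where username = 'arthur';

  insert into account(username, password, email, created_on)
  values ('arthur', '****', 'arthur@h2g2.com', current_timestamp);
commit;

You can combine this into a single statement, but that doesn't make a big difference:

with new_values (username, password, email, created_on) as (
  values ('arthur', '****', 'arthur@h2g2.com', current_timestamp)
), deleted as (
  delete from account
  where username = (select username from new_values)
)
insert into account(username, password, email, created_on)
select * from new_values;

The only advantage here is that you don't need to repeat the values twice. Keep in mind that PostgreSQL does not guarantee the order in which the sub-statements of a WITH query are executed, so the INSERT may still run into the existing row and fail with a unique violation.

However, if account is referenced from other tables (i.e. a foreign key "points" to account), this will not work, because the DELETE fails while the row is still referenced.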
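
To illustrate the foreign-key caveat, here is a minimal sketch. The login_history table is hypothetical and not part of the question; it only serves as a table that references account, and the example assumes account starts out empty.

```sql
-- Hypothetical child table, introduced only for illustration.
create table login_history (
  login_id  serial primary key,
  user_id   integer not null references account (user_id),
  logged_at timestamp not null
);

insert into account(username, password, email, created_on)
values ('arthur', '****', 'arthur@h2g2.com', current_timestamp);

-- Create one row that references the new account.
insert into login_history(user_id, logged_at)
select user_id, current_timestamp
from account
where username = 'arthur';

-- Rejected with a foreign-key violation, because login_history
-- still references this row (the default action is NO ACTION).
delete from account
where username = 'arthur';
```

Declaring the foreign key with on delete cascade would let the DELETE succeed, but it would also remove the referencing rows, which is usually not what you want when you merely replace an account.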

A better solution is to use INSERT ON CONFLICT and update the existing row with the new data:

insert into account(username, password, email, created_on)
values ('arthur', '****', 'arthur@h2g2.com', current_timestamp)
on conflict (username) do update
  set password = excluded.password,
      email = excluded.email;

This will still throw an error if the email already exists; unfortunately, with on conflict do update you can only specify a single unique constraint.
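
The email problem can be reproduced with a short sketch. It assumes the account table from the question is initially empty; the username 'ford' is made up for the example.

```sql
-- Existing account that owns the email address.
insert into account(username, password, email, created_on)
values ('arthur', '****', 'arthur@h2g2.com', current_timestamp);

-- Different username, same email: the conflict target (username)
-- does not match, so no update happens and the statement fails
-- with a unique violation on the email column.
insert into account(username, password, email, created_on)
values ('ford', '****', 'arthur@h2g2.com', current_timestamp)
on conflict (username) do update
  set password = excluded.password,
      email = excluded.email;
```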

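To handle two different unique constraints, things get a bit more complicated:

with new_values (username, password, email, created_on) as (
  values ('arthur', '****', 'arthur@h2g2.com', current_timestamp)
), inserted as (
  insert into account(username, password, email, created_on)
  select * from new_values
  on conflict do nothing
  returning user_id
)
update account
  set password = nv.password
from new_values nv
where (account.username = nv.username or account.email = nv.email)
  and not exists (select * from inserted);

First an insert is attempted. If any unique constraint is violated, the insert is simply ignored (on conflict do nothing).

The final UPDATE statement is only executed if no row was inserted in the previous step. This is achieved through and not exists (select * from inserted).

Because either the username or the email can cause the constraint violation, the update uses an OR condition on those two columns to find the existing row. You can also update more columns there if needed.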

Answer 2

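I think a very simple approach is to add a UNIQUE index on the specific columns. When writing the ALTER statement, include the IGNORE keyword (this is MySQL syntax; it was removed in MySQL 5.7.4 and does not exist in PostgreSQL). Like this: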

ALTER IGNORE TABLE orders ADD UNIQUE INDEX idx_name (col1, col2, col3, others);

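This will remove all duplicate rows. As an added benefit, future INSERTs that are duplicates will fail with an error. As always, you may want to take a backup before running something like this...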
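
A PostgreSQL-flavoured sketch of the same idea takes two steps: delete the duplicates, then add the constraint. It assumes the orders table and the columns col1, col2 and col3 from the statement above already exist; the constraint name is made up for the example.

```sql
BEGIN;

-- Delete every row for which an identical (col1, col2, col3) row with a
-- smaller ctid exists. ctid is PostgreSQL's physical row identifier and
-- is used here only to tell otherwise identical rows apart.
-- Rows containing NULL in these columns never compare equal and are kept.
DELETE FROM orders a
USING orders b
WHERE a.ctid > b.ctid
  AND a.col1 = b.col1
  AND a.col2 = b.col2
  AND a.col3 = b.col3;

-- From now on, duplicate inserts are rejected.
ALTER TABLE orders
  ADD CONSTRAINT orders_col1_col2_col3_key UNIQUE (col1, col2, col3);

COMMIT;
```

Running both steps in one transaction means that if adding the constraint fails, the DELETE is rolled back as well.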

Hope this helps.

How to insert data and delete duplicate rows with a single SQL command
