Question

I have the following table:

id | a_id | b_id | success
--------------------------
1    34      43      1
2    34      84      1
3    34      43      0
4    65      43      1
5    65      84      1
6    93      23      0
7    93      23      0

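I want to remove duplicates that share the same a_id and b_id, but keep one record from each group. If possible, the record that is kept should be one with success = 1. So in the example table, the third record and either the sixth or the seventh record should be deleted. How can I do this?

I am using MySQL 5.1.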

Answer 1

The task is simple:

Find the minimal set of records that should not be deleted.
Delete the other records.
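
Before deleting anything, it can help to list which (a_id, b_id) pairs are actually duplicated. A minimal sketch, assuming the table is called report as in the MySQL queries of this answer; the column aliases copies, best_success and first_id are made up for illustration:

```sql
-- List each duplicated (a_id, b_id) pair with the number of copies,
-- whether any copy succeeded, and the lowest id in the group.
SELECT a_id,
       b_id,
       COUNT(*)     AS copies,
       MAX(success) AS best_success,
       MIN(id)      AS first_id
FROM report
GROUP BY a_id, b_id
HAVING COUNT(*) > 1;
```

On the sample data from the question this should list the pairs (34, 43) and (93, 23), each with two copies.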

The Oracle way:

delete from sample_table
where id not in (
  select id
  from (
    -- number the rows in each (a_id, b_id) group, successful rows first;
    -- id breaks ties so the choice is deterministic
    select id,
           row_number() over (partition by a_id, b_id
                              order by success desc, id) rown
    from sample_table
  )
  where rown = 1
)

The solution in MySQL:

This gives you the ids that should not be deleted:

Select id from (SELECT * FROM report ORDER BY success desc) t 
group by  t.a_id, t.b_id

You can then delete the other rows:

  delete from report where id not in (the above query)

The combined DML:

delete from report 
  where id not in (Select id 
                    from (SELECT * FROM report 
                          ORDER BY success desc) t 
                    group by  t.a_id, t.b_id)

Running a select on report now gives:

ID  A_ID    B_ID    SUCCESS
1   34       43     1
2   34       84     1
4   65       43     1
5   65       84     1
6   93       23     0

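You can check the documentation on how the GROUP BY clause works when no aggregate function is supplied:

When using this feature, all rows in each group should have the same values for the columns that are omitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.

So simply ordering by success before the GROUP BY lets us pick up the first duplicate row with success = 1.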
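
Because the documentation quoted above leaves the choice of row up to the server, a variant that does not depend on that behavior may be safer. The following is a sketch under stated assumptions, not a tested drop-in: it uses MySQL's multi-table DELETE with a self-join to remove every row for which a "better" row exists in the same (a_id, b_id) group, where better means a higher success value or, on a tie, a lower id. It assumes success is never NULL; the aliases r and keeper are illustrative.

```sql
-- Delete each row r that is beaten by some other row in its group.
-- The best row of every group has no such "keeper", so it survives.
DELETE r
FROM report AS r
JOIN report AS keeper
  ON  keeper.a_id = r.a_id
  AND keeper.b_id = r.b_id
  AND (keeper.success > r.success
       OR (keeper.success = r.success AND keeper.id < r.id));
```

On the sample table this should remove ids 3 and 7, leaving 1, 2, 4, 5 and 6. Replacing DELETE r with SELECT DISTINCT r.* first is a cheap way to check which rows would go.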

Answer 2

How about this:

CREATE TABLE new_table 
  AS (SELECT * FROM old_table WHERE 1 AND success = 1 GROUP BY a_id,b_id);
DROP TABLE old_table;
RENAME TABLE new_table TO old_table;

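This method creates a new table under a temporary name and copies into it all the deduplicated rows from the old table that have success = 1. It then drops the old table and renames the new table to the old table's name.

If I understand your question correctly, this is probably the simplest solution (although I don't know whether it actually works).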
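
One thing to watch with the statement above: the success = 1 filter also drops groups that have no successful row at all, such as (93, 23) in the question, and GROUP BY without aggregates leaves the surviving row unspecified. Below is a hedged sketch of a variant that keeps exactly one row per pair, preferring success = 1 and then the lowest id, while staying with the copy-and-rename approach. old_table and new_table are the names from the answer; the aliases o and better are illustrative, and success is assumed to be non-NULL.

```sql
-- Copy only the preferred row of each (a_id, b_id) group:
-- a row is kept when no other row in its group beats it.
CREATE TABLE new_table AS
SELECT o.*
FROM old_table AS o
WHERE NOT EXISTS (
    SELECT 1
    FROM old_table AS better
    WHERE better.a_id = o.a_id
      AND better.b_id = o.b_id
      AND (better.success > o.success
           OR (better.success = o.success AND better.id < o.id))
);

-- Swap the tables only after checking that new_table looks right.
DROP TABLE old_table;
RENAME TABLE new_table TO old_table;
```

CREATE TABLE ... SELECT does not create indexes on the new table, so a primary key or any other index on old_table would have to be added again afterwards.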

Answer 3

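This should work:

If you can use a procedural language, e.g. PL/SQL, it is simple. On the other hand, if you are looking for a clean SQL solution, it is probably possible, but not very "nice". Here is an example in PL/SQL: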

begin
  for x in ( select a_id, b_id
             from   table
             having count(*) > 1
             group  by a_id, b_id )
  loop
    for y in ( select *
               from   table
               where  a_id = x.a_id
               and    b_id = x.b_id
               order  by success desc )
    loop
      delete from table
      where  a_id = y.a_id
      and    b_id = y.b_id
      and    id != y.id;
      exit; -- Only do the first row
    end loop;
  end loop;
end;

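Here is the idea: for each duplicated combination of a_id and b_id, select all instances, ordered so that any instance with success = 1 comes first. Delete every row of that combination except the first one, which is a successful one if there is any.

Or perhaps: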

declare
  l_a_id integer := -1;
  l_b_id integer := -1;
begin
  for x in ( select *
             from   table
             order  by a_id, b_id, success desc )
  loop
    if  x.a_id = l_a_id and x.b_id = l_b_id
    then
      delete from table where  id = x.id;
    end if;
    l_a_id := x.a_id;
    l_b_id := x.b_id;
  end loop;
end;

Answer 4

In MySQL, if you don't care which record is kept, a single ALTER TABLE will do.

ALTER IGNORE TABLE tbl_name
ADD UNIQUE INDEX(a_id, b_id)

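It ignores the duplicate records and keeps only the unique ones. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this applies to older versions such as the 5.1 mentioned in the question.

A useful link: MySQL: ALTER IGNORE TABLE ADD UNIQUE, what will be truncated?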

Remove duplicates from db
