Removing duplicates in PostgreSQL

Time: 2016-10-08 04:33:04

Tags: postgresql

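I am working on a Postgres query to remove duplicates from a table. The table below is generated dynamically, and I want to write a select query that drops a record when its value in the first column is duplicated.

The table looks like this: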

1st col  2nd col
 4        62
 6        34
 5        26
 5        12

I want to write a select query that removes either row 3 or row 4.
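To try the answers below, the sample table can be recreated along these lines. This is a minimal sketch: the table name df1 and the column names first and second are borrowed from the first answer, and the integer column types are an assumption, since the question does not state them.

```sql
-- Sample data from the question; the integer column types are assumed.
create table df1 (first integer, second integer);

insert into df1 (first, second) values
    (4, 62),
    (6, 34),
    (5, 26),
    (5, 12);   -- same value in the first column as the previous row
```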

4 Answers:

Answer 0: (score: 2)

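             -- second is not in the group by, so it has to be aggregated;
             -- with exactly one row per group, min(second) is that row's value
             select count(first) as cnt, first, min(second) as second
             from df1 
             group by first
             having count(first) = 1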

If you want to keep one of the duplicated rows (sorry, I initially missed that you wanted that):

             select first, min(second) 
             from df1 
             group by first

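Here the table is named df1 and the columns are named first and second.

You can leave out count(first) as cnt if you don't need it.

At the risk of stating the obvious, once you know how to select the data you want (or don't want), deleting the records is straightforward.

If you want to replace the table or make a new one, you can use create table as: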

             create table tmp as 
             select count(first) as cnt, first, min(second) as second
             from df1 
             group by first
             having count(first) = 1;

             drop table df1;

             -- leave out the helper column cnt when rebuilding df1
             create table df1 as select first, second from tmp;

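Or, instead of dropping and recreating df1, use DELETE FROM. With tmp built as above, this removes every row whose first value occurs more than once: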

DELETE FROM df1 WHERE first NOT IN (SELECT first FROM tmp);

You could also use select into, and so on.

Answer 1: (score: 2)

No intermediate table is needed:

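-- Keep the row with the lowest ctid for each value and delete the rest.
-- No having clause here: filtering on count(*) > 1 would leave the
-- non-duplicated values out of the subquery and delete those rows too.
delete from df1
where ctid not in (select min(ctid)
                   from df1
                   group by first_column);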
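On PostgreSQL versions where min() is not defined for the tid type, the same idea can be expressed with a window function instead. The following is only a sketch of that alternative, not part of the original answer; the aliases row_id, rn and numbered are made up for illustration, and the table and column names follow the statement above.

```sql
-- Number the rows within each group of equal first_column values
-- (in ctid order) and delete every row after the first one.
delete from df1
where ctid in (
    select row_id
    from (
        select ctid as row_id,
               row_number() over (partition by first_column
                                  order by ctid) as rn
        from df1
    ) numbered
    where rn > 1
);
```

Which row of each group survives is arbitrary in both variants, since ctid only reflects the physical position of a row.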

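If you are deleting many rows from a large table, the approach with an intermediate table is probably faster.

If you only want to get one row per distinct value of a column, you can use: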
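A rough sketch of what the intermediate-table route could look like when done inside one transaction. The temporary table name df1_dedup is hypothetical, and keeping the row with the smallest second_column is just one possible choice of survivor.

```sql
begin;

-- Copy one row per first_column value (here: the smallest second_column).
create temp table df1_dedup as
select distinct on (first_column) *
from df1
order by first_column, second_column;

-- Replace the contents of df1; truncate can be rolled back in PostgreSQL,
-- but it takes an exclusive lock on the table until commit.
truncate df1;
insert into df1 select * from df1_dedup;

drop table df1_dedup;

commit;
```

Compared with drop table followed by create table as, this keeps the original table definition in place, including its indexes, constraints and grants.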

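-- The leading order by expression has to match the distinct on expression;
-- second_column decides which row of each group is returned.
select distinct on (first_column) *
from the_table
order by first_column, second_column;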

Or simply

select first_column, min(second_column)
from the_table
group by first_column;

Answer 2: (score: 1)

  • If you want to SELECT the unique rows:
SELECT * FROM ztable u
WHERE NOT EXISTS (      -- There is no other record
    SELECT * FROM ztable x
    WHERE x.id = u.id   -- with the same id
    AND x.ctid < u.ctid -- , but with a different(lower) "internal" rowid
    );                  -- so u.* must be unique
  • If you want to SELECT the other rows, the ones suppressed by the previous query:
SELECT * FROM ztable nu
WHERE EXISTS (           -- another record exists
    SELECT * FROM ztable x
    WHERE x.id = nu.id   -- with the same id
    AND x.ctid < nu.ctid -- , but with a different(lower) "internal" rowid
    );
  • If you want to DELETE records so that the table becomes unique (while keeping one record per id):
DELETE FROM ztable d
WHERE EXISTS (          -- another record exists
    SELECT * FROM ztable x
    WHERE x.id = d.id   -- with the same id
    AND x.ctid < d.ctid -- , but with a different(lower) "internal" rowid
    );
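
After running the DELETE, a quick check along these lines can confirm that no id occurs more than once. It is only a sketch reusing the ztable and id names from the queries above; the alias copies is made up.

```sql
-- Lists every id that still has more than one row;
-- an empty result means the table is unique on id.
select id, count(*) as copies
from ztable
group by id
having count(*) > 1;
```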

Answer 3: (score: 0)

So basically, this is what I did:

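 -- one row per value of first, holding the smallest second
 create temp table t1 as 
 select first, min (second) as second
 from df1 
 group by first;

 -- join back to df1 to pick up the matching rows
 select * from df1 
 inner join t1 on t1.first = df1.first and t1.second = df1.second;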

This gave me a satisfactory result. Thanks for your help, @Hack-R.