Question

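I'm using a Postgres query to remove duplicates from a table. The table below is generated dynamically, and I want to write a select query that drops a record when its value in the first column is duplicated.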

The table looks like this:

1st col  2nd col
 4        62
 6        34
 5        26
 5        12

I want to write a select query that removes either row 3 or row 4.

Answer 1

             select count(first) as cnt, first, min(second) as second
             from df1 
             group by first
             having count(first) = 1

If you want to keep one of the duplicate rows (sorry, I missed that at first):

             select first, min(second) 
             from df1 
             group by first

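Here the table is named df1 and the columns are named first and second.

You can drop count(first) as cnt if you don't need it.

At the risk of stating the obvious, once you know how to select the data you want (or don't want), deleting the records is straightforward.

If you want to replace the table or build a new one, you can do the deletion with create table as: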

             create table tmp as 
             select first, min(second) as second
             from df1 
             group by first
             having count(first) = 1;

             drop table df1;

             create table df1 as select * from tmp;

Or use DELETE FROM:

DELETE FROM df1 WHERE first NOT IN (SELECT first FROM tmp);

You can also use select into, and so on.
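
As a rough sketch of the select into variant, assuming the same df1 table with columns first and second. The name df1_dedup is hypothetical and only stands in for whatever the new table should be called:

```sql
-- Sketch only: df1_dedup is a made-up name for the new table.
-- Keeps one row per value of first, taking the smallest second.
select first, min(second) as second
into df1_dedup
from df1
group by first;
```

This leaves df1 untouched, so the result can be checked before the original table is dropped or renamed.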

Answer 2

There is no need for an intermediate table:

delete from df1
where ctid not in (select min(ctid)
                   from df1
                   group by first_column);

If you need to delete many rows from a large table, the approach with an intermediate table is probably faster.
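
A minimal sketch of that intermediate-table approach, assuming that no other table references df1 through a foreign key and that a short exclusive lock on df1 is acceptable. The name df1_keep is hypothetical:

```sql
-- Sketch only: df1_keep is a hypothetical scratch table.
begin;

-- Copy one row per first_column value into a temporary table.
create temp table df1_keep as
select distinct on (first_column) *
from df1
order by first_column;

-- Empty the original table, then put the kept rows back.
truncate df1;
insert into df1 select * from df1_keep;

drop table df1_keep;
commit;
```

Running everything in one transaction means that a failure partway through rolls back the truncate as well, so the original rows are not lost.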

If you only want to get one row per unique value of a column, you can use:

select distinct on (first_column) *
from the_table
order by first_column;

Or simply:

select first_column, min(second_column)
from the_table
group by first_column;

Answer 3

If you want to SELECT the unique rows:

SELECT * FROM ztable u
WHERE NOT EXISTS (      -- There is no other record
    SELECT * FROM ztable x
    WHERE x.id = u.id   -- with the same id
    AND x.ctid < u.ctid -- , but with a different(lower) "internal" rowid
    );                  -- so u.* must be unique

If you want to SELECT the other rows, the ones suppressed by the previous query:

SELECT * FROM ztable nu
WHERE EXISTS (           -- another record exists
    SELECT * FROM ztable x
    WHERE x.id = nu.id   -- with the same id
    AND x.ctid < nu.ctid -- , but with a different(lower) "internal" rowid
    );

If you want to DELETE records to make the table unique (while keeping one record per id):

DELETE FROM ztable d
WHERE EXISTS (          -- another record exists
    SELECT * FROM ztable x
    WHERE x.id = d.id   -- with the same id
    AND x.ctid < d.ctid -- , but with a different(lower) "internal" rowid
    );
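
To try the DELETE on the sample data from the question, something like the following could be used. This is only a sketch: the table definition and the column name val are assumptions, since the question does not give a schema.

```sql
-- Sketch only: the table definition is assumed, not taken from the question.
create temp table ztable (id integer, val integer);

insert into ztable (id, val) values
    (4, 62),
    (6, 34),
    (5, 26),
    (5, 12);

-- Remove every row for which a row with the same id and a lower ctid exists.
delete from ztable d
where exists (
    select 1 from ztable x
    where x.id = d.id
    and x.ctid < d.ctid
);

-- One row per id should remain.
select * from ztable order by id;
```

Which of the two rows with id 5 survives depends on their physical order in the table, so this fits the case in the question where either duplicate may be removed.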

Answer 4

So basically this is what I did:

 create temp table t1 as 
 select first, min(second) as second
 from df1 
 group by first;

 select df1.* from df1 
 inner join t1 on t1.first = df1.first and t1.second = df1.second;

This gives a satisfactory result. Thanks for your help, @Hack-R.

PostgreSQL: removing duplicates
