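I'm using a Postgres query to remove duplicates from a table. The table below is generated dynamically, and I want to write a select query that drops a record when its value in the first column is duplicated.
The table looks like this:
1st col 2nd col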
4 62
6 34
5 26
5 12
I want to write a select query that removes either the 3rd or the 4th row.
Answer 0 (score: 2)
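-- min(second) is needed because second is not in the GROUP BY;
-- with count = 1 it is simply that group's only value
select count(first) as cnt, first, min(second) as second
from df1
group by first
having count(first) = 1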
If you want to keep one of the rows instead (sorry, I missed that at first):
select first, min(second)
from df1
group by first
Here the table is named df1 and the columns are named first and second.
You can actually drop the count(first) as cnt if you don't need it.
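To see how the two selects differ on the sample data from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory database standing in for Postgres; the column names are double-quoted only to stay clear of keyword handling, and both queries are plain GROUP BY SQL that should behave the same way in Postgres.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE df1 ("first" INTEGER, "second" INTEGER)')
conn.executemany(
    "INSERT INTO df1 VALUES (?, ?)",
    [(4, 62), (6, 34), (5, 26), (5, 12)],  # sample rows from the question
)

# Variant 1: keep only the values of "first" that occur exactly once.
only_unique = conn.execute(
    'SELECT "first", min("second") FROM df1 '
    'GROUP BY "first" HAVING count(*) = 1 ORDER BY "first"'
).fetchall()

# Variant 2: keep one row per value of "first" (the smallest "second").
one_per_key = conn.execute(
    'SELECT "first", min("second") FROM df1 '
    'GROUP BY "first" ORDER BY "first"'
).fetchall()

print(only_unique)  # [(4, 62), (6, 34)]
print(one_per_key)  # [(4, 62), (5, 12), (6, 34)]
```

The first variant drops both rows with first = 5, while the second keeps one of them, which is what the question asks for.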
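At the risk of stating the obvious, once you know how to select the data you want (or don't want), deleting the records is straightforward.
If you want to replace the table or make a new one, you can do the deletion with create table as: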
create table tmp as
select first, min(second) as second
from df1
group by first
having count(first) = 1;
drop table df1;
create table df1 as select * from tmp;
Or use DELETE FROM:
DELETE FROM df1 WHERE first NOT IN (SELECT first FROM tmp);
You could also use select into, and so on.
Answer 1 (score: 2)
No intermediate table is needed:
-- keep the row with the lowest ctid for each first_column value
delete from df1
where ctid not in (select min(ctid)
from df1
group by first_column);
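As a rough check of the keep-the-lowest-row-id delete, here is a small sketch that can be run without a Postgres server. It assumes Python's sqlite3 module, and it uses SQLite's rowid where the Postgres statement uses ctid; the two are not the same thing, but each gives every physical row a distinct identifier, which is all this technique needs. The subquery has no HAVING filter, so rows whose first_column is already unique are kept as well.

```python
import sqlite3

# Assumption: SQLite replaces PostgreSQL, and rowid replaces ctid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df1 (first_column INTEGER, second_column INTEGER)")
conn.executemany(
    "INSERT INTO df1 VALUES (?, ?)",
    [(4, 62), (6, 34), (5, 26), (5, 12)],  # sample rows from the question
)

# Delete every row that is not the lowest-rowid row of its group.
conn.execute(
    "DELETE FROM df1 WHERE rowid NOT IN "
    "(SELECT min(rowid) FROM df1 GROUP BY first_column)"
)

remaining = conn.execute(
    "SELECT first_column, second_column FROM df1 ORDER BY rowid"
).fetchall()
print(remaining)  # [(4, 62), (6, 34), (5, 26)]
```

Only the second of the two rows with first_column = 5 is removed.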
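If you are deleting many rows from a large table, the approach with an intermediate table is probably faster.
If you only want one row per unique value of a column, you can use: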
select distinct on (first_column) *
from the_table
order by first_column, second_column;
Or simply:
select first_column, min(second_column)
from the_table
group by first_column;
Answer 2 (score: 1)
SELECT the unique rows:
SELECT * FROM ztable u
WHERE NOT EXISTS ( -- There is no other record
SELECT * FROM ztable x
WHERE x.id = u.id -- with the same id
AND x.ctid < u.ctid -- , but with a different(lower) "internal" rowid
); -- so u.* must be unique
SELECT the other rows, the ones suppressed by the previous query:
SELECT * FROM ztable nu
WHERE EXISTS ( -- another record exists
SELECT * FROM ztable x
WHERE x.id = nu.id -- with the same id
AND x.ctid < nu.ctid -- , but with a different(lower) "internal" rowid
);
DELETE records to make the table unique (while keeping one record per id):
DELETE FROM ztable d
WHERE EXISTS ( -- another record exists
SELECT * FROM ztable x
WHERE x.id = d.id -- with the same id
AND x.ctid < d.ctid -- , but with a different(lower) "internal" rowid
);
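The EXISTS pattern above can be tried out with a short sketch. It assumes Python's sqlite3 module in place of Postgres and SQLite's rowid in place of ctid, and the payload column is a hypothetical name added for illustration, since the answer only names id. The DELETE refers to the table by its own name rather than an alias, to stay within syntax that older SQLite versions accept.

```python
import sqlite3

# Assumption: SQLite replaces PostgreSQL, and rowid replaces ctid.
conn = sqlite3.connect(":memory:")
# "payload" is a hypothetical second column used only for this sketch.
conn.execute("CREATE TABLE ztable (id INTEGER, payload INTEGER)")
conn.executemany(
    "INSERT INTO ztable VALUES (?, ?)",
    [(4, 62), (6, 34), (5, 26), (5, 12)],
)

# Rows that would be removed: an earlier row (lower rowid) has the same id.
doomed = conn.execute(
    "SELECT id, payload FROM ztable AS nu WHERE EXISTS "
    "(SELECT 1 FROM ztable AS x WHERE x.id = nu.id AND x.rowid < nu.rowid)"
).fetchall()

# The same condition used in a DELETE keeps the first row seen for each id.
conn.execute(
    "DELETE FROM ztable WHERE EXISTS "
    "(SELECT 1 FROM ztable AS x "
    " WHERE x.id = ztable.id AND x.rowid < ztable.rowid)"
)
kept = conn.execute("SELECT id, payload FROM ztable ORDER BY rowid").fetchall()

print(doomed)  # [(5, 12)]
print(kept)    # [(4, 62), (6, 34), (5, 26)]
```

Running the SELECT first is a cheap way to preview exactly which rows the DELETE will remove.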
Answer 3 (score: 0)
So basically I did this:
create temp table t1 as
select first, min(second) as second
from df1
group by first;
select * from df1
inner join t1 on t1.first = df1.first and t1.second = df1.second;
This is a satisfactory answer. Thanks for your help, @Hack-R.