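I have a "votes" table with the following columns: voter, election_year, election_type, party.
I need to delete every row that duplicates another row on the combination of voter and election_year, and I am having trouble figuring out how to do this.
I ran the following: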
WITH CTE AS(
SELECT voter,
election_year,
ROW_NUMBER()OVER(PARTITION BY voter, election_year ORDER BY voter) as RN
FROM votes
)
DELETE
FROM CTE where RN>1
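This is based on another Stack Overflow answer, but it appears to be specific to SQL Server. I have seen ways to do this using a unique ID, but this particular table does not have that luxury. How can I adapt the script above to delete the unwanted duplicates? Thanks!
Edit: As requested, here is the table definition with some sample data: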
CREATE TABLE public.votes
(
voter varchar(10),
election_year smallint,
election_type varchar(2),
party varchar(3)
);
INSERT INTO votes
(voter, election_year, election_type, party)
VALUES
('2435871347', 2018, 'PO', 'EV'),
('2435871347', 2018, 'RU', 'EV'),
('2435871347', 2018, 'GE', 'EV'),
('2435871347', 2016, 'PO', 'EV'),
('2435871347', 2016, 'GE', 'EV'),
('10215121/8', 2016, 'GE', 'ED')
;
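To make the goal concrete, a query along the following lines should list the (voter, election_year) combinations that occur more than once. This is only a sketch against the sample data above; the column alias n is just an illustrative name.

```sql
-- List each (voter, election_year) combination that appears more than once.
-- With the sample data this returns two combinations (in no guaranteed order):
--   ('2435871347', 2018) with n = 3
--   ('2435871347', 2016) with n = 2
SELECT voter,
       election_year,
       count(*) AS n
FROM votes
GROUP BY voter, election_year
HAVING count(*) > 1;
```

After a successful cleanup, the same query should return no rows.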
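Answer 0 (score: 3)
Deleting from or updating a CTE does not work in Postgres; see the accepted answer to "PostgreSQL with-delete “relation does not exists”".
Since there is no primary key, you can (ab)use the ctid pseudo-column to identify the rows to delete.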
WITH cte AS (
    SELECT ctid,
           row_number() OVER (PARTITION BY voter, election_year
                              ORDER BY voter) AS rn
    FROM votes
)
DELETE FROM votes
USING cte
WHERE cte.rn > 1
  AND cte.ctid = votes.ctid;
You might also consider introducing a primary key.
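A minimal sketch of what that could look like, assuming the duplicates have already been removed. The column name id and the constraint name votes_voter_year_key are hypothetical choices for illustration, not something taken from the question.

```sql
-- Assumption: duplicates on (voter, election_year) are already gone;
-- otherwise the UNIQUE constraint below fails with a duplicate-key error.

-- Add a surrogate key so that each row has a stable identifier
-- ("id" is a hypothetical column name).
ALTER TABLE votes
    ADD COLUMN id bigserial PRIMARY KEY;

-- Prevent the same duplicates from being inserted again
-- ("votes_voter_year_key" is a hypothetical constraint name).
ALTER TABLE votes
    ADD CONSTRAINT votes_voter_year_key UNIQUE (voter, election_year);
```

Whether a surrogate key, a unique constraint, or both is appropriate depends on how the table is used; the unique constraint is the part that stops the duplicates from coming back.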
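Answer 1 (score: 2)
Here is one option: a self-join that deletes every row for which another row with the same voter and election_year and a larger ctid exists, so exactly one row per combination survives.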
DELETE FROM votes T1
USING votes T2
WHERE T1.ctid < T2.ctid
AND T1.voter = T2.voter
AND T1.election_year = T2.election_year;
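Since a DELETE like this is destructive, it may be worth trying it inside a transaction first. The following is a sketch of such a dry run using the statement above and the sample data from the question; the alias n is illustrative.

```sql
BEGIN;

-- The same self-join delete as above.
DELETE FROM votes T1
USING votes T2
WHERE T1.ctid < T2.ctid
  AND T1.voter = T2.voter
  AND T1.election_year = T2.election_year;

-- Should now return no rows: no (voter, election_year) pair is left duplicated.
SELECT voter, election_year, count(*) AS n
FROM votes
GROUP BY voter, election_year
HAVING count(*) > 1;

-- With the six sample rows from the question, three rows remain.
SELECT count(*) FROM votes;

-- Undo the dry run; use COMMIT instead once the result looks right.
ROLLBACK;
```

Note that which of the duplicate rows survives is arbitrary from the application's point of view, since it depends only on physical row position.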
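Answer 2 (score: 0)
ctid is a system column that exists in every PostgreSQL table. It gives the physical location of the tuple (row version) within the table, so at any given moment it is unique for each row. Note that a row's ctid changes when the row is updated or moved by VACUUM FULL, so it is only suitable as a short-lived identifier. You almost had it right; you just need ctid, because the rows have no unique ID of their own.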
WITH CTE AS (
    SELECT ctid, voter,
           election_year,
           ROW_NUMBER() OVER (PARTITION BY voter, election_year ORDER BY voter) AS RN
    FROM votes
)
DELETE FROM votes v WHERE v.ctid IN (SELECT CTE.ctid FROM CTE WHERE CTE.RN > 1);