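In Postgres, how can I delete all duplicate records except one, choosing the one to keep by ordering on a column?
Suppose I have the following table foo: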
 id | name | region    | created_at
----+------+-----------+-------------------------------
  1 | foo  | sydney    | 2018-05-24 15:40:32.593745+10
  2 | foo  | melbourne | 2018-05-24 17:28:59.452225+10
  3 | foo  | sydney    | 2018-05-29 22:17:02.927263+10
  4 | foo  | sydney    | 2018-06-13 16:44:32.703174+10
  5 | foo  | sydney    | 2018-06-13 16:45:01.324273+10
  6 | foo  | sydney    | 2018-06-13 17:04:49.487767+10
  7 | foo  | sydney    | 2018-06-13 17:05:13.592844+10
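For anyone who wants to reproduce this, the table could be set up roughly as follows. This is only a sketch: the column types (bigint, text, timestamptz) and the constraints are assumptions, since only the query output is shown here.

```sql
-- Hypothetical setup: column types and constraints are assumed, not taken from the real schema.
CREATE TABLE foo (
    id         bigint PRIMARY KEY,
    name       text NOT NULL,
    region     text NOT NULL,
    created_at timestamptz NOT NULL
);

-- Sample rows copied from the table shown above.
INSERT INTO foo (id, name, region, created_at) VALUES
    (1, 'foo', 'sydney',    '2018-05-24 15:40:32.593745+10'),
    (2, 'foo', 'melbourne', '2018-05-24 17:28:59.452225+10'),
    (3, 'foo', 'sydney',    '2018-05-29 22:17:02.927263+10'),
    (4, 'foo', 'sydney',    '2018-06-13 16:44:32.703174+10'),
    (5, 'foo', 'sydney',    '2018-06-13 16:45:01.324273+10'),
    (6, 'foo', 'sydney',    '2018-06-13 17:04:49.487767+10'),
    (7, 'foo', 'sydney',    '2018-06-13 17:05:13.592844+10');
```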
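I want to delete all duplicates, where a duplicate is determined by the (name, region) tuple, but keep the row with the greatest created_at in each group. The result would be: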
 id | name | region    | created_at
----+------+-----------+-------------------------------
  2 | foo  | melbourne | 2018-05-24 17:28:59.452225+10
  7 | foo  | sydney    | 2018-06-13 17:05:13.592844+10
But I don't know where to start. Any ideas?
Answer 0 (score: 1)
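DELETE FROM foo
WHERE id IN
    (SELECT ranked.id
     FROM (SELECT id,
                  -- Number the rows in each (name, region) group, newest first.
                  ROW_NUMBER ()
                  OVER (PARTITION BY name, region
                        ORDER BY created_at DESC)
                  AS row_no
           FROM foo) AS ranked  -- alias added; older Postgres versions require one
     -- row_no = 1 is the newest row of its group and is kept.
     WHERE ranked.row_no > 1);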
Answer 1 (score: 1)
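Use a subquery with ROW_NUMBER and PARTITION BY to pick out the rows that duplicate a (name, region) pair, while keeping the most recent row of each pair. Make sure the subquery in FROM has an alias (here AS a); older PostgreSQL versions reject an unaliased subquery with a syntax error:
SELECT *
FROM foo
WHERE id IN (
    SELECT a.id
    FROM (
        -- Number the rows in each (name, region) group, newest first.
        SELECT id, ROW_NUMBER() OVER (
            PARTITION BY name, region
            ORDER BY created_at DESC
        ) AS row_no
        FROM foo
    ) AS a
    -- row_no = 1 is the newest row of its group, so everything else is a duplicate.
    WHERE a.row_no > 1
);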
...this returns the rows that would be deleted. Once you are happy with the result, replace SELECT * with DELETE to remove them.
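The "check first, then delete" workflow can also be done in one pass inside a transaction. The following is only a sketch, not part of the original answer: it assumes the table foo from the question, partitions by (name, region) to match the tuple the question asks about, and uses RETURNING to list the rows the DELETE removed so they can be reviewed before committing.

```sql
BEGIN;

-- Delete every row that is not the newest in its (name, region) group
-- and report exactly which rows were removed.
DELETE FROM foo
WHERE id IN (
    SELECT a.id
    FROM (
        SELECT id, ROW_NUMBER() OVER (
            PARTITION BY name, region
            ORDER BY created_at DESC
        ) AS row_no
        FROM foo
    ) AS a
    WHERE a.row_no > 1
)
RETURNING id, name, region, created_at;

-- Inspect what is left; with the sample data from the question
-- this should be the rows with id 2 and id 7.
SELECT * FROM foo ORDER BY id;

-- Undo the delete if anything looks wrong;
-- run COMMIT instead to keep the change.
ROLLBACK;
```

Because the DELETE runs inside a transaction, nothing is permanent until COMMIT, which gives the same safety as running the SELECT first without having to edit the query between the two steps.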