I have the following table, table1:
----------------------------------
| Id | Value | Date |
----------------------------------
| 1 | xxx | 05/01/2015 |
| 2 | xxx | 05/02/2015 |
| 3 | yyy | 06/01/2015 |
| 4 | yyy | 06/01/2015 |
----------------------------------
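I want to remove the duplicate rows by keeping the latest date; if the dates are equal, keep the highest Id. (In other words, keep the row with the latest date and latest Id, and delete the rows with older dates and Ids.)
No programming, query only. The table is one of the joined tables in a multi-join query.
It should be compatible with Vertica.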
Answer 0: (score: 1)
The following statement deletes duplicate rows and keeps the highest Id:
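-- MySQL-style multi-table DELETE; it may need rewriting for Vertica.
-- Rows count as duplicates here when they share the same Date.
DELETE t1 FROM table1 t1
INNER JOIN
    table1 t2
WHERE
    t1.id < t2.id AND t1.Date = t2.Date;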
This may help; you can modify it as needed.
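As a sketch of one such modification: the version below assumes that rows sharing the same Value are the duplicates (my reading of the sample data) and that the engine accepts a correlated EXISTS subquery inside DELETE, which I have not verified on Vertica. It removes a row whenever a newer row, or a row with the same Date and a higher Id, exists for the same Value.

```sql
-- Assumption: rows with the same Value are duplicates of each other.
-- A row is deleted when another row with the same Value has a later Date,
-- or the same Date and a higher Id.
DELETE FROM table1
WHERE EXISTS (
    SELECT 1
    FROM table1 t2
    WHERE t2.Value = table1.Value
      AND (t2.Date > table1.Date
           OR (t2.Date = table1.Date AND t2.Id > table1.Id))
);
```

On the sample data this would remove Ids 1 and 3 and leave Ids 2 and 4.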
Answer 1: (score: 0)
I think Vertica will support this:
-- keep only the row ranked first per value (latest date, then highest id)
delete from table1
where table1.id not in (select t.id
                        from (select t2.*,
                                     row_number() over (partition by t2.value order by t2.date desc, t2.id desc) as seqnum
                              from table1 t2
                             ) t
                        where t.seqnum = 1
                       );
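Before running a delete like this, it can help to preview which rows it would remove. A minimal sketch, using the same row_number() window but ordering both date and id descending so that seqnum 1 is the newest row per value; every row with a higher seqnum is a candidate for deletion:

```sql
-- Rows with seqnum > 1 are the older duplicates within each value.
SELECT t.id, t.value, t.date, t.seqnum
FROM (SELECT t2.*,
             ROW_NUMBER() OVER (PARTITION BY t2.value
                                ORDER BY t2.date DESC, t2.id DESC) AS seqnum
      FROM table1 t2
     ) t
WHERE t.seqnum > 1;
```

For the sample table in the question this should list Ids 1 and 3.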
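Answer 2: (score: 0)
If you want to join this table with other tables, you may just want to select the rows you need, rather than deleting anything before the join.
Vertica offers the analytic limit clause, which comes in handy here.
Here is how it would work with your input data: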
WITH
input(Id,Value,Date) AS (
SELECT 1,'xxx',DATE '2015-05-01'
UNION ALL SELECT 2,'xxx',DATE '2015-05-02'
UNION ALL SELECT 3,'yyy',DATE '2015-06-01'
UNION ALL SELECT 4,'yyy',DATE '2015-06-01'
)
SELECT
*
FROM input
LIMIT 1 OVER(PARTITION BY Value ORDER BY Date DESC, id DESC);
-- out Id | Value | Date
-- out ----+-------+------------
-- out 2 | xxx | 2015-05-02
-- out 4 | yyy | 2015-06-01
-- out (2 rows)
-- out
-- out Time: First fetch (2 rows): 14.240 ms. All rows formatted: 14.276 ms
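Since the question mentions a multi-join query, here is a sketch of how the de-duplicated rows could be joined without deleting anything. The table other_table and its columns Value and Info are hypothetical, made up for illustration, and I am assuming the analytic limit clause is accepted inside a WITH clause just as it is in a subquery.

```sql
-- other_table(Value, Info) is a hypothetical table, for illustration only.
WITH dedup AS (
  SELECT Id, Value, Date
  FROM table1
  LIMIT 1 OVER(PARTITION BY Value ORDER BY Date DESC, Id DESC)
)
SELECT d.Id, d.Value, d.Date, o.Info
FROM dedup d
JOIN other_table o
  ON o.Value = d.Value;
```

The base table stays untouched; only the join sees one row per Value.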
Does this help?
Well, if you really do need to delete, you can also use the above in a NOT IN predicate to run the delete, as I did here:
-- creating a temp table to delete from ....
CREATE LOCAL TEMPORARY TABLE t1 (Id,Value,Date)
ON COMMIT PRESERVE ROWS AS (
SELECT 1,'xxx',DATE '2015-05-01'
UNION ALL SELECT 2,'xxx',DATE '2015-05-02'
UNION ALL SELECT 3,'yyy',DATE '2015-06-01'
UNION ALL SELECT 4,'yyy',DATE '2015-06-01'
);
-- delete as announced ..
DELETE FROM t1 WHERE id NOT IN (
SELECT
id
FROM t1
LIMIT 1 OVER(PARTITION BY Value ORDER BY Date DESC, id DESC)
);
-- check the content now ...
SELECT * FROM t1;
-- out CREATE TABLE
-- out Time: First fetch (0 rows): 16.081 ms. All rows formatted: 16.110 ms
-- out OUTPUT
-- out --------
-- out 2
-- out (1 row)
-- out
-- out Time: First fetch (1 row): 61.740 ms. All rows formatted: 61.788 ms
-- out Id | Value | Date
-- out ----+-------+------------
-- out 2 | xxx | 2015-05-02
-- out 4 | yyy | 2015-06-01
-- out (2 rows)
-- out Time: First fetch (2 rows): 6.761 ms. All rows formatted: 6.814 ms
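As an extra check after a delete like this, a small sketch of a query that should come back empty once every Value has exactly one row left (it uses the temp table t1 from above):

```sql
-- Any row returned here is a Value that still has duplicates.
SELECT Value, COUNT(*) AS row_count
FROM t1
GROUP BY Value
HAVING COUNT(*) > 1;
```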