Deleting all but the most recent entry from single SQL table

时间:2015-09-01 22:44:00

标签: mysql sql

I have a single SQL table that contains multiple entries for each customerID (some customerID's only have one entry which I want to keep). I need to remove all but the most recent entry per customerID, using the invoiceDate field as my marker.

So I need to go from this:

+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
|          1 |  1393995600 |       xx  |
|          1 |  1373688000 |       xx  |
|          1 |  1365220800 |       xx  |
|          2 |  1265220800 |       xx  |
|          2 |  1173688000 |       xx  |
|          3 |  1325330800 |       xx  |
+------------+-------------+-----------+

To this:

+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
|          1 |  1393995600 |       xx  |
|          2 |  1265220800 |       xx  |
|          3 |  1325330800 |       xx  |
+------------+-------------+-----------+

Any guidance would be greatly appreciated!

4 个答案:

答案 0 :(得分:2)

  1. Write a query to select all the rows you want to delete:
SELECT * FROM t
WHERE invoiceDate NOT IN (
    SELECT MAX(invoiceDate)
    -- "FROM t AS t2" isn't supported by MySQL, see http://stackoverflow.com/a/14302701/227576
    FROM (SELECT * FROM t) AS t2
    WHERE t2.customerId = t.customerId
    GROUP BY t2.customerId
)

This may take a long time on a big database.

  1. If you're satisfied, change the query to a DELETE statement:
DELETE FROM t
WHERE invoiceDate NOT IN (
    SELECT MAX(invoiceDate)
    -- "FROM t AS t2" isn't supported by MySQL, see http://stackoverflow.com/a/14302701/227576
    FROM (SELECT * FROM t) AS t2
    WHERE t2.customerId = t.customerId
    GROUP BY t2.customerId
)

See http://sqlfiddle.com/#!9/6e031/1

If you have multiple rows whose date is the most recent for the same customer, you would have to look for duplicates and decide which one you want to keep yourself. For instance, look at customerId 2 on the SQL fiddle link above.

答案 1 :(得分:0)

Let us asume that the table name is transaction_table.

create table test1 AS
select * from (
  select * from transaction_table order by customerID, invoiceDate desc) temp
group by customerID

You will have the output data in test1 table.

答案 2 :(得分:0)

试试这个

  with todelete as
(
            select 
            CustomerId, InvoiceId, InvoiceDate, Row_Number() over (partition by CustomerId  order by InvoiceDate desc) as Count
             from DeleteDuplicate
)


delete from todelete
where count > 1

答案 3 :(得分:0)

delete from ex_4 where
rowid in
(select rowid
from ex_4 a 
where to_date(invoicedate,'DDMMYYYY') = (select max(to_date(invoicedate,'DDMMYYYY')) from ex_4 b where a.customerid != b.customerid))

这是如何在oracle中完成的。这个查询将删除所有但最近添加的row.Looking你的表结构我假设invoicedate列是varchar2类型所以将它转换为date使用to_date函数