Deleting duplicates in MySQL

Time: 2010-02-01 12:49:49

Tags: mysql

I have a table like this:

userid  visitorid   time
1       10          2009-12-23
1       18          2009-12-06
1       18          2009-12-14
1       18          2009-12-18
1705    1678        2010-01-24
1705    1699        2010-01-24
1705    1700        2010-01-24
1712    1           2010-01-25
1712    640         2010-01-24
1712    925         2010-01-25
1712    1600        2010-01-24
1712    1630        2010-01-25
1712    1630        2010-01-24
1713    1           2010-01-24
1713    1           2010-01-23

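I want to run a query that deletes every duplicate (rows sharing the same userid and visitorid) except the most recent one. Does anyone have an idea?

For example, after the query the table should look like this: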

userid  visitorid   time
1       10          2009-12-23
1       18          2009-12-18
1705    1678        2010-01-24
1705    1699        2010-01-24
1705    1700        2010-01-24
1712    1           2010-01-25
1712    640         2010-01-24
1712    925         2010-01-25
1712    1600        2010-01-24
1712    1630        2010-01-25
1713    1           2010-01-24

4 answers:

Answer 0 (score: 4)

-- Delete each row whose Time is not the latest for its (UserID, VisitorID) pair.
DELETE FROM YourTable VersionA
  WHERE VersionA.Time NOT IN
    ( SELECT MAX( VersionB.Time ) Time
         FROM YourTable VersionB
         WHERE VersionA.UserID = VersionB.UserID
           AND VersionA.VisitorID = VersionB.VisitorID )

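The syntax may need adjusting, but this should do it. In MySQL in particular, a DELETE generally cannot select from its own target table in a subquery (error 1093), so you may want to run the subselect into its own table first and then run the DELETE against that result set.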
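In MySQL, the "own table first" suggestion might look roughly like the sketch below. LatestVisit and LatestTime are made-up names for illustration, and YourTable is the placeholder from the answer above, so treat this as an outline under those assumptions rather than a drop-in statement.

```sql
-- Step 1: store the latest Time per (UserID, VisitorID) pair in a scratch table.
-- LatestVisit and LatestTime are hypothetical names, not part of the original answer.
CREATE TEMPORARY TABLE LatestVisit AS
SELECT UserID, VisitorID, MAX(Time) AS LatestTime
FROM YourTable
GROUP BY UserID, VisitorID;

-- Step 2: multi-table DELETE that removes every row older than the stored latest one.
DELETE t
FROM YourTable AS t
JOIN LatestVisit AS l
  ON  t.UserID    = l.UserID
  AND t.VisitorID = l.VisitorID
WHERE t.Time < l.LatestTime;

-- Step 3: clean up the scratch table.
DROP TEMPORARY TABLE LatestVisit;
```

Because the DELETE reads the scratch table instead of YourTable, it no longer selects from the table it is modifying. Rows that tie on the latest Time for a pair are all kept by this sketch.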

Answer 1 (score: 0)

Assuming your table is named Visitors:

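-- Delete every row that is not the latest one for its (userid, visitorid) pair.
DELETE v1.* FROM Visitors v1
LEFT JOIN (
    -- Latest time per pair; these are the rows to keep.
    SELECT userid, visitorid, MAX(time) AS time
    FROM Visitors v2
    GROUP BY userid, visitorid
) v3 ON v1.userid=v3.userid AND v1.visitorid=v3.visitorid AND v1.time = v3.time
WHERE v3.userid IS NULL;  -- no match means v1 is an older duplicate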

Answer 2 (score: 0)

DELETE  mo.*
FROM    (
        -- Latest time per (userid, visitorid) pair
        SELECT  userid, visitorid, MAX(time) AS mtime
        FROM    mytable
        GROUP BY
                userid, visitorid
        ) mi
JOIN    mytable mo
ON      mo.userid = mi.userid
        AND mo.visitorid = mi.visitorid
        AND mo.time < mi.mtime;

Answer 3 (score: 0)

You need to work around MySQL bug #6980 with a doubly nested subquery:

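-- Match on the full (userid, visitorid, time) key, so a row is kept only
-- when its time is the latest for its own pair.
DELETE FROM foo_table
WHERE (userid, visitorid, time) IN (
    SELECT userid, visitorid, time FROM (
        SELECT userid, visitorid, time FROM
            foo_table
            LEFT OUTER JOIN (
                SELECT userid, visitorid, MAX(time) AS time
                FROM foo_table
                GROUP BY userid, visitorid
                ) AS foo_table_keep
                    USING (userid, visitorid, time)
        WHERE
            foo_table_keep.time IS NULL
        ) AS foo_table_delete
    );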

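GROUP BY collapses the duplicates into a single row, and MAX(time) picks the value you want to keep. Use an aggregate function other than MAX if you need to.

Wrapping the subquery twice, and giving each level an alias, avoids the error: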

ERROR 1093 (HY000): You can't specify target table 'foo_table' for update in FROM clause

It also has the added advantage of making it clearer how the statement chooses what to keep.
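To check the outcome of any of the statements above, a query along these lines lists the pairs that still have more than one row. It is a small sketch that assumes the foo_table name and the userid and visitorid columns used in this answer; the copies alias is made up for illustration.

```sql
-- List (userid, visitorid) pairs that still have more than one row.
-- "copies" is a hypothetical alias for the per-pair row count.
SELECT userid, visitorid, COUNT(*) AS copies
FROM foo_table
GROUP BY userid, visitorid
HAVING COUNT(*) > 1;
```

If the cleanup worked and no two rows tie on the latest time for a pair, this should come back empty.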