SQL: delete duplicates, keeping only the record with the minimum value in another column

Time: 2018-10-18 10:32:40

Tags: sql ms-access

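I am trying to delete duplicate orders from a table, keeping only the order with the earliest invoice date. I came up with something like the following, but it runs very slowly. Keep in mind that I am using MS Access 2010.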

db.Execute "DELETE * FROM [PO Data] AS P1 WHERE [PO Number] = [PO Number] AND [Invoice Date] <> (SELECT MIN([Invoice Date]) FROM [PO Data] AS P2 WHERE P1.[PO Number] = P2.[PO Number])"
db.Execute "DELETE * FROM [PO Data] WHERE [PO Number] = [PO Number]"

Any ideas on how to improve this?

2 Answers:

Answer 0: (score: 0)

This version:

DELETE * FROM [PO Data] AS P1
    WHERE [PO Number] = [PO Number] AND
          [Invoice Date] <> (SELECT MIN([Invoice Date])
                             FROM [PO Data] AS P2
                             WHERE P1.[PO Number] = P2.[PO Number]
                            );

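There are a few odd things here. Why [PO Number] = [PO Number]? Why <>? The first condition compares a column with itself, so it is true for every row with a non-NULL [PO Number] and filters nothing. And since no date in a group can be earlier than that group's minimum, > expresses the same test as <> more directly.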

Consider this query:

DELETE * FROM [PO Data] AS P1
    WHERE [Invoice Date] > (SELECT MIN([Invoice Date])
                            FROM [PO Data] AS P2
                            WHERE P1.[PO Number] = P2.[PO Number]
                           );

To speed up the query, you want an index on [PO Data]([PO Number], [Invoice Date]).
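Which rows survive the correlated delete can be checked on a small in-memory table. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for MS Access, and the table name po_data, the column names, the index name idx_po_date, and the sample rows are all made up for illustration. SQLite's query planner is not the Access engine, so this shows the logic of the delete and the shape of the composite index, not how fast Access will run it.

```python
import sqlite3


def dedupe_keep_earliest(rows):
    """Delete every row dated later than the earliest invoice of its PO number.

    `rows` is a list of (po_number, invoice_date) tuples; dates are ISO
    strings (YYYY-MM-DD) so that plain text comparison orders them correctly.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the [PO Data] table.
    conn.execute("CREATE TABLE po_data (po_number TEXT, invoice_date TEXT)")
    conn.executemany("INSERT INTO po_data VALUES (?, ?)", rows)
    # Composite index on (po_number, invoice_date), as the answer suggests.
    conn.execute("CREATE INDEX idx_po_date ON po_data (po_number, invoice_date)")
    # Correlated delete: the subquery finds the group's earliest date.
    conn.execute(
        """
        DELETE FROM po_data
        WHERE invoice_date > (SELECT MIN(p2.invoice_date)
                              FROM po_data AS p2
                              WHERE p2.po_number = po_data.po_number)
        """
    )
    conn.commit()
    remaining = conn.execute(
        "SELECT po_number, invoice_date FROM po_data "
        "ORDER BY po_number, invoice_date"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    sample = [
        ("PO-1", "2018-03-01"),
        ("PO-1", "2018-01-15"),
        ("PO-2", "2018-05-05"),
        ("PO-2", "2018-05-05"),  # tie on the earliest date
    ]
    for row in dedupe_keep_earliest(sample):
        print(row)
```

Note that rows tied on the earliest date are all kept, because none of them is strictly later than the minimum. If exactly one row per PO number is required, a tie-breaker column (for example a primary key) would be needed.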

EDIT:

If you want only the rows with the earliest invoice date across the whole table, just remove the correlation clause:

DELETE * FROM [PO Data] AS P1
    WHERE [Invoice Date] > (SELECT MIN([Invoice Date])
                            FROM [PO Data] AS P2
                           );
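The difference between the correlated and uncorrelated forms is easy to miss, so here is a small sketch of the uncorrelated one. As before, it assumes Python's sqlite3 module instead of MS Access, and the table po_data, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def keep_overall_earliest(rows):
    """Keep only rows whose invoice date equals the table-wide minimum."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the [PO Data] table; ISO date strings.
    conn.execute("CREATE TABLE po_data (po_number TEXT, invoice_date TEXT)")
    conn.executemany("INSERT INTO po_data VALUES (?, ?)", rows)
    # No correlation on po_number: the minimum is taken over the whole table.
    conn.execute(
        """
        DELETE FROM po_data
        WHERE invoice_date > (SELECT MIN(invoice_date) FROM po_data)
        """
    )
    conn.commit()
    remaining = conn.execute(
        "SELECT po_number, invoice_date FROM po_data "
        "ORDER BY po_number, invoice_date"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    sample = [
        ("PO-1", "2018-01-15"),
        ("PO-1", "2018-03-01"),
        ("PO-2", "2018-05-05"),
    ]
    # Only rows dated 2018-01-15 remain; PO-2 disappears entirely.
    print(keep_overall_earliest(sample))
```

With this form, any PO number whose invoices are all later than the global minimum is removed completely, which is usually not what "remove duplicates" means.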

Answer 1: (score: 0)

DELETE 
FROM [PO Data] a
WHERE [Invoice Date] > (SELECT MIN([Invoice Date]) FROM [PO Data] b
WHERE b.[PO Number]=a.[PO Number]);

OR

DELETE a
FROM [PO Data] a
INNER JOIN [PO Data] b
  ON b.[PO Number]=a.[PO Number] AND a.[Invoice Date]>b.[Invoice Date]

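The second one is faster, because it does not need to evaluate MIN(). It also depends on your indexes and the size of your data. If there are only a few duplicate rows, the subquery version is fine.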
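That the two formulations delete the same rows can be checked with a short sketch. It rests on several assumptions: it runs on Python's sqlite3 module rather than MS Access, the table po_data and its columns are hypothetical, and because SQLite has no DELETE ... JOIN syntax the self-join condition is written as an EXISTS subquery instead. The DELETE a FROM ... INNER JOIN form shown above is accepted by SQL Server and MySQL; whether Access accepts it as written is not confirmed here. The sketch compares results only and says nothing about relative speed.

```python
import sqlite3

# MIN()-based delete: remove rows later than their group's earliest date.
MIN_DELETE = """
DELETE FROM po_data
WHERE invoice_date > (SELECT MIN(b.invoice_date)
                      FROM po_data AS b
                      WHERE b.po_number = po_data.po_number)
"""

# Self-join condition expressed with EXISTS: remove a row if any row with
# the same PO number has a strictly earlier invoice date.
EXISTS_DELETE = """
DELETE FROM po_data
WHERE EXISTS (SELECT 1
              FROM po_data AS b
              WHERE b.po_number = po_data.po_number
                AND b.invoice_date < po_data.invoice_date)
"""


def run_delete(rows, delete_sql):
    """Load rows into a fresh in-memory table, run the delete, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE po_data (po_number TEXT, invoice_date TEXT)")
    conn.executemany("INSERT INTO po_data VALUES (?, ?)", rows)
    conn.execute(delete_sql)
    conn.commit()
    remaining = conn.execute(
        "SELECT po_number, invoice_date FROM po_data "
        "ORDER BY po_number, invoice_date"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    sample = [
        ("PO-1", "2018-03-01"),
        ("PO-1", "2018-01-15"),
        ("PO-2", "2018-05-05"),
    ]
    print(run_delete(sample, MIN_DELETE) == run_delete(sample, EXISTS_DELETE))
```

Both statements keep a row exactly when no other row with the same PO number is earlier, so they agree on every input without NULL dates.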