Removing duplicates from the SQL result set of ONE table

Time: 2013-11-04 18:13:20

Tags: sql ms-access duplicate-removal

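Good afternoon/evening,

I'm looking for one last modification to the query below. I need to remove rows in which certain column values are repeated. I'm currently using the following SQL: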

SELECT CBNEW.*
FROM CallbackNewID CBNEW 
INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate 
FROM CallbackNewID 
GROUP BY IDNEW) AS groupedCBNEW 
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW);

My result set looks like this:

ID      RecID  Comp  Rem  Date_                IDNEW  IDOLD  CB?  CallbackDate
138618  83209  1     0    2012-03-16 12:40:00  83209  83209  2    16-Mar-12
138619  83209  1     0    2012-03-16 12:40:00  83209  83209  2    16-Mar-12
110470  83799  1     0    2011-07-27 11:46:00  83799  83799  10   27-Jul-11
110471  83799  1     0    2011-07-27 11:46:00  83799  83799  10   27-Jul-11

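However, this gives me duplicate values in the CallbackDate and IDNEW columns, because the table contains several rows with different primary keys (ID) that share the same IDNEW and CallbackDate values.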
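To make the cause concrete, here is a minimal sketch that reproduces the duplicates outside Access. It uses Python's sqlite3 module purely as a stand-in engine; the reduced three-column table and the ISO-formatted date strings are assumptions for illustration, not the real schema.

```python
import sqlite3

# Hypothetical in-memory copy of CallbackNewID, reduced to the columns the query uses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallbackNewID (ID INTEGER PRIMARY KEY, IDNEW INTEGER, CallbackDate TEXT)"
)
conn.executemany(
    "INSERT INTO CallbackNewID VALUES (?, ?, ?)",
    [
        (138618, 83209, "2012-03-16"),
        (138619, 83209, "2012-03-16"),
        (110470, 83799, "2011-07-27"),
        (110471, 83799, "2011-07-27"),
    ],
)

# Same shape as the query in the question: latest CallbackDate per IDNEW.
latest_per_idnew = """
SELECT CBNEW.ID, CBNEW.IDNEW, CBNEW.CallbackDate
FROM CallbackNewID AS CBNEW
INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate
            FROM CallbackNewID
            GROUP BY IDNEW) AS groupedCBNEW
ON CBNEW.CallbackDate = groupedCBNEW.MaxDate
   AND CBNEW.IDNEW = groupedCBNEW.IDNEW
ORDER BY CBNEW.ID
"""
duplicate_rows = conn.execute(latest_per_idnew).fetchall()

# Every row that ties on (IDNEW, MAX(CallbackDate)) survives the join,
# so each IDNEW appears twice here.
for row in duplicate_rows:
    print(row)
```

The join matches on IDNEW and the maximum date only, so nothing in it distinguishes between rows that tie on both values.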

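If I dump this result into Excel, I can use Remove Duplicates on the first ID column and the problem is solved.

But what I want is for my result to contain only the FIRST instance of the ID column wherever IDNEW and CallbackDate are duplicated.

I'm sure I only need to add a small piece of SQL, but I can't for the life of me find the answer.

Many thanks for your help.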

4 answers:

Answer 0: (score: 1)

Try adding MIN(ID) to the inner query and then adding it to the ON clause:

SELECT CBNEW.*
FROM CallbackNewID CBNEW 
INNER JOIN (SELECT IDNEW, MIN(ID) AS MinId, MAX(CallbackDate) AS MaxDate 
FROM CallbackNewID 
GROUP BY IDNEW) AS groupedCBNEW 
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) 
   AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
   AND (CBNEW.ID = groupedCBNEW.MinId) ;

sqlfiddle demo
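One caveat with this approach: MIN(ID) is taken over every row of an IDNEW, not only over the rows at the latest CallbackDate. With the sample data in the question that makes no difference, but if the lowest ID belongs to an older callback, the three-way join finds no match and that IDNEW drops out of the result. The sketch below illustrates this with made-up rows, using Python's sqlite3 as a stand-in engine, and shows one possible variant that takes the minimum ID only among the rows at the latest date. The table contents and the variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallbackNewID (ID INTEGER PRIMARY KEY, IDNEW INTEGER, CallbackDate TEXT)"
)
# Hypothetical data: the lowest ID (1) belongs to an older callback.
conn.executemany(
    "INSERT INTO CallbackNewID VALUES (?, ?, ?)",
    [(1, 500, "2012-01-01"), (2, 500, "2012-03-16"), (3, 500, "2012-03-16")],
)

# MIN(ID) and MAX(CallbackDate) computed independently over the whole group.
min_id_over_group = """
SELECT CBNEW.ID
FROM CallbackNewID AS CBNEW
INNER JOIN (SELECT IDNEW, MIN(ID) AS MinId, MAX(CallbackDate) AS MaxDate
            FROM CallbackNewID
            GROUP BY IDNEW) AS g
ON CBNEW.CallbackDate = g.MaxDate
   AND CBNEW.IDNEW = g.IDNEW
   AND CBNEW.ID = g.MinId
"""

# Variant: join to the latest date first, then take MIN(ID) among those rows.
min_id_among_latest = """
SELECT MIN(CBNEW.ID) AS ID, CBNEW.IDNEW, CBNEW.CallbackDate
FROM CallbackNewID AS CBNEW
INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate
            FROM CallbackNewID
            GROUP BY IDNEW) AS g
ON CBNEW.CallbackDate = g.MaxDate
   AND CBNEW.IDNEW = g.IDNEW
GROUP BY CBNEW.IDNEW, CBNEW.CallbackDate
"""

lost = conn.execute(min_id_over_group).fetchall()
kept = conn.execute(min_id_among_latest).fetchall()
print(lost)  # no row has both ID = 1 and the latest date
print(kept)
```

The variant returns only the grouped columns; to get every column of the surviving row, its ID can be joined back to CallbackNewID.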

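Answer 1: (score: 1)

This is a fairly "brute force" approach. Just take the results of your original query and apply Min() to [ID], Max() to [Comp] and [Rem], and GROUP BY everything else: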

SELECT 
    Min(t.ID) AS MinOfID, 
    t.RecID, 
    Max(t.Comp) AS MaxOfComp, 
    Max(t.Rem) AS MaxOfRem, 
    t.Date_, 
    t.IDNEW, 
    t.IDOLD, 
    t.[CB?], 
    t.CallbackDate
FROM 
    (
        SELECT CBNEW.*
        FROM 
            CallbackNewID CBNEW 
            INNER JOIN 
            (
                SELECT IDNEW, MAX(CallbackDate) AS MaxDate 
                FROM CallbackNewID 
                GROUP BY IDNEW
            ) AS groupedCBNEW 
                ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) 
                AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
    ) t
GROUP BY 
    t.RecID, 
    t.Date_, 
    t.IDNEW, 
    t.IDOLD, 
    t.[CB?], 
    t.CallbackDate;

It may not be very elegant, but if it works...
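Two things are worth checking before relying on the grouped query. Rows that differ in any of the GROUP BY columns (for example Date_) stay separate, and Min()/Max() are evaluated per column, so one output row can combine values taken from different source rows. A small sketch with made-up rows, using Python's sqlite3 as a stand-in engine; it groups the table directly because every hypothetical row here is already at the latest CallbackDate, and the [CB?] and IDOLD columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE CallbackNewID (
           ID INTEGER PRIMARY KEY, RecID INTEGER, Comp INTEGER, Rem INTEGER,
           Date_ TEXT, IDNEW INTEGER, CallbackDate TEXT)"""
)
# Hypothetical rows: same IDNEW and CallbackDate, but Comp, Rem and Date_ vary.
conn.executemany(
    "INSERT INTO CallbackNewID VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 10, 1, 0, "2012-03-16 12:40:00", 10, "2012-03-16"),
        (2, 10, 0, 1, "2012-03-16 12:40:00", 10, "2012-03-16"),
        (3, 10, 1, 0, "2012-03-16 12:45:00", 10, "2012-03-16"),
    ],
)

grouped_query = """
SELECT MIN(ID) AS MinOfID, RecID, MAX(Comp) AS MaxOfComp, MAX(Rem) AS MaxOfRem,
       Date_, IDNEW, CallbackDate
FROM CallbackNewID
GROUP BY RecID, Date_, IDNEW, CallbackDate
ORDER BY MinOfID
"""
grouped_rows = conn.execute(grouped_query).fetchall()

# Two rows come back for one IDNEW because Date_ differs, and the first row
# shows Comp = 1 with Rem = 1, a pair that no single source row contains.
for row in grouped_rows:
    print(row)
```

If the non-key columns are always identical within a duplicate set, as in the sample output in the question, neither effect occurs.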

Answer 2: (score: 0)

In MS SQL Server, I think you're looking for the ROW_NUMBER() function.

Something like this should help you get what you want:

SELECT
    X.*
FROM
    (
        SELECT
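            CBNEW.*,
            -- ROW_NUMBER() needs an ORDER BY; the lowest ID in each group gets row_num = 1
            ROW_NUMBER() OVER (PARTITION BY CBNEW.IDNEW, groupedCBNEW.MaxDate ORDER BY CBNEW.ID) AS [row_num]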
        FROM
            CallbackNewID CBNEW 
            INNER JOIN 
            (
                SELECT
                    IDNEW,
                    MAX(CallbackDate) AS MaxDate
                FROM
                    CallbackNewID 
                GROUP BY
                    IDNEW
            ) AS groupedCBNEW ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
    ) X
WHERE
    X.row_num = 1

Answer 3: (score: 0)

SELECT
    A.*
FROM
    (SELECT
            *,
            -- ID breaks ties, so the lowest ID among rows sharing the latest CallbackDate is kept
            ROW_NUMBER() OVER (PARTITION BY IDNEW ORDER BY CallbackDate DESC, ID ASC)
                          AS [row_num]
     FROM CallbackNewID 
    ) A
WHERE
    A.row_num = 1
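Note that ROW_NUMBER() is a window function, which Access SQL does not provide, so this style of query applies to SQL Server rather than to an Access database. The logic can still be checked on sample rows. Below is a rough sketch using Python's sqlite3 as a stand-in engine (window functions need SQLite 3.25 or later). The reduced table, the extra older row, and the ID tie-breaker in the ORDER BY are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallbackNewID (ID INTEGER PRIMARY KEY, IDNEW INTEGER, CallbackDate TEXT)"
)
conn.executemany(
    "INSERT INTO CallbackNewID VALUES (?, ?, ?)",
    [
        (100000, 83209, "2011-01-01"),  # hypothetical older callback
        (138618, 83209, "2012-03-16"),
        (138619, 83209, "2012-03-16"),
        (110470, 83799, "2011-07-27"),
        (110471, 83799, "2011-07-27"),
    ],
)

# Number rows per IDNEW: latest CallbackDate first, lowest ID first among ties.
first_of_latest = """
SELECT A.ID, A.IDNEW, A.CallbackDate
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY IDNEW
                                ORDER BY CallbackDate DESC, ID ASC) AS row_num
      FROM CallbackNewID) AS A
WHERE A.row_num = 1
ORDER BY A.ID
"""
winners = conn.execute(first_of_latest).fetchall()
for row in winners:
    print(row)
```

Without the ID tie-breaker, which of the tied rows receives row_num = 1 is not guaranteed.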