Delete records that have duplicate values in certain columns?

Time: 2014-07-21 18:09:53

Tags: sql sql-server duplicate-removal

I have the following table STG, which holds 2 million records:

STG(ACCT_NUM,NAME,ADDRESS,CITY,STATE)

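I added a column SSN nvarchar(255) NULL and re-inserted the 2 million records. Because SSN was inserted as NULL for some of them, the table now contains duplicates: pairs of rows whose other column values match, where one row has an SSN value and the other does not.

I want to delete every duplicate record that matches another row on all other columns but has no SSN. There are also some unique records with a NULL SSN, and I do not want those deleted.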

3 Answers:

Answer 0: (score: 0)

This should be all you need to delete the records where SSN is NULL.

DELETE FROM STG
WHERE SSN IS NULL;

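Edit:

Given the comments below, here is a MySQL solution for what you are trying to do. It deletes a NULL-SSN row only when another row matches it on all other columns and has an SSN: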

DELETE FROM STG
    WHERE SSN IS NULL
    AND (ACCT_NUM, NAME, ADDRESS, CITY, STATE)
    IN (SELECT ACCT_NUM, NAME, ADDRESS, CITY, STATE FROM 
        (SELECT * FROM STG) AS STG1 WHERE SSN IS NOT NULL);
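
The row-constructor `IN` above is MySQL syntax; SQL Server, which the question is tagged with, does not accept it. A correlated `EXISTS` expresses the same condition more portably. The following is a minimal sketch, assuming SQLite (through Python's `sqlite3` module) as a stand-in engine and three made-up sample rows. It has not been run against SQL Server.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STG (ACCT_NUM TEXT, NAME TEXT, ADDRESS TEXT, "
    "CITY TEXT, STATE TEXT, SSN TEXT NULL)"
)
conn.executemany(
    "INSERT INTO STG VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("1", "Ann", "1 Main St", "Austin", "TX", None),           # duplicate without SSN
        ("1", "Ann", "1 Main St", "Austin", "TX", "111-11-1111"),  # same record with SSN
        ("2", "Bob", "2 Oak Ave", "Boise", "ID", None),            # unique, NULL SSN: keep
    ],
)

# Delete a NULL-SSN row only when another row matches on every other
# column and carries a non-NULL SSN.
conn.execute(
    """
    DELETE FROM STG
    WHERE SSN IS NULL
      AND EXISTS (
          SELECT 1
          FROM STG AS s2
          WHERE s2.ACCT_NUM = STG.ACCT_NUM
            AND s2.NAME     = STG.NAME
            AND s2.ADDRESS  = STG.ADDRESS
            AND s2.CITY     = STG.CITY
            AND s2.STATE    = STG.STATE
            AND s2.SSN IS NOT NULL
      )
    """
)

remaining = conn.execute(
    "SELECT ACCT_NUM, SSN FROM STG ORDER BY ACCT_NUM"
).fetchall()
print(remaining)
```

Only the SSN-less copy of account 1 is removed; the unique NULL-SSN record for account 2 survives. On a 2-million-row table it is probably worth running the same predicate as a `SELECT COUNT(*)` first to see how many rows would be deleted.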

Answer 1: (score: 0)

This is a guess, since the information is incomplete and the requirements currently conflict.

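-- Select only the NULL-SSN rows that have a matching row carrying an SSN.
-- A LEFT JOIN would keep every s1 row, so the delete would empty the table;
-- a join predicate on s1.SSN <> s2.SSN is also never true when s1.SSN is NULL.
with MyDeleteCTE as
(
    SELECT 
        s1.ACCT_NUM
        ,s1.NAME
        ,s1.ADDRESS
        ,s1.CITY
        ,s1.STATE
    FROM STG s1
    where s1.SSN is null
        and exists
        (
            select 1
            from STG s2
            where s1.ACCT_NUM = s2.ACCT_NUM
                and s1.NAME = s2.NAME
                and s1.ADDRESS = s2.ADDRESS
                and s1.CITY = s2.CITY
                and s1.STATE = s2.STATE
                and s2.SSN is not null
        )
)

delete from MyDeleteCTE;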

Answer 2: (score: 0)

DELETE s1
    FROM STG s1
    INNER JOIN STG s2 on 
        s1.ACCT_NUM = s2.ACCT_NUM
        and s1.NAME = s2.NAME
        and s1.ADDRESS = s2.ADDRESS
        and s1.STATE = s2.STATE
        and s1.CITY = s2.CITY
        -- s1.SSN <> s2.SSN is never true when s1.SSN is NULL, so test s2 directly
        and isnull(s2.SSN,'') <> ''
        -- s1 has no SSN (NULL or empty string)
        and isnull(s1.SSN,'') = ''
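
One thing worth checking in any join-based variant is how NULL behaves in comparisons. Under SQL's three-valued logic, `a <> b` evaluates to unknown when either side is NULL, and a join or `WHERE` condition treats unknown as not true, so a NULL SSN can only be detected with `IS NULL` (or `ISNULL`/`COALESCE`). The snippet below is a small sketch of that behavior, assuming SQLite through Python's `sqlite3` as a stand-in. SQL Server does not allow a bare comparison in a select list, but it applies the same logic inside predicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL compared with anything yields NULL (unknown), which a WHERE or ON
# clause treats as "not true"; sqlite3 returns it to Python as None.
not_equal = conn.execute("SELECT NULL <> '111-11-1111'").fetchone()[0]

# IS NULL is the test that actually detects a missing value (1 means true).
is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

print(not_equal, is_null)
```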