Finding and updating specific duplicates in MS SQL

Time: 2014-01-13 22:29:33

Tags: sql sql-server database duplicate-removal

Given the following table:

+----+---------+-----------+-------------+-------+
| ID |  NAME   | LAST NAME |    PHONE    | STATE |
+----+---------+-----------+-------------+-------+
|  1 | James   | Vangohg   | 04333989878 | NULL  |
|  2 | Ashly   | Baboon    | 09898788909 | NULL  |
|  3 | James   | Vangohg   | 04333989878 | NULL  |
|  4 | Ashly   | Baboon    | 09898788909 | NULL  |
|  5 | Michael | Foo       | 02933889990 | NULL  |
|  6 | James   | Vangohg   | 04333989878 | NULL  |
+----+---------+-----------+-------------+-------+

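I want to use MS SQL to find and update duplicates (based on name, last name, and phone number), but only the earlier ones. So the result for the table above would be: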

+----+---------+-----------+-------------+-------+
| ID |  NAME   | LAST NAME |    PHONE    | STATE |
+----+---------+-----------+-------------+-------+
|  1 | James   | Vangohg   | 04333989878 | DUPE  |
|  2 | Ashly   | Baboon    | 09898788909 | DUPE  |
|  3 | James   | Vangohg   | 04333989878 | DUPE  |
|  4 | Ashly   | Baboon    | 09898788909 | NULL  |
|  5 | Michael | Foo       | 02933889990 | NULL  |
|  6 | James   | Vangohg   | 04333989878 | NULL  |
+----+---------+-----------+-------------+-------+
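To try the answers below against this data, the sample table can be recreated with a script along these lines. The table name dbo.YourTable is borrowed from the first answer; the column types and lengths are assumptions, since the question does not state them.

```sql
-- Sketch of the sample data; column types and lengths are assumed, not given in the question.
CREATE TABLE dbo.YourTable
(
    ID          INT         NOT NULL PRIMARY KEY,
    NAME        VARCHAR(50) NOT NULL,
    [LAST NAME] VARCHAR(50) NOT NULL,
    PHONE       VARCHAR(20) NOT NULL,  -- stored as text so the leading zero is preserved
    STATE       VARCHAR(10) NULL
);

INSERT INTO dbo.YourTable (ID, NAME, [LAST NAME], PHONE, STATE)
VALUES (1, 'James',   'Vangohg', '04333989878', NULL),
       (2, 'Ashly',   'Baboon',  '09898788909', NULL),
       (3, 'James',   'Vangohg', '04333989878', NULL),
       (4, 'Ashly',   'Baboon',  '09898788909', NULL),
       (5, 'Michael', 'Foo',     '02933889990', NULL),
       (6, 'James',   'Vangohg', '04333989878', NULL);
```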

3 Answers:

Answer 0 (score: 2)

This query uses a CTE to apply row numbers; any row with a number > 1 is a dupe of the row with the highest ID.

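;WITH x AS
(
  -- Number the rows in each (NAME, LAST NAME, PHONE) group, newest ID first.
  SELECT ID, NAME, [LAST NAME], PHONE, STATE,
    ROW_NUMBER() OVER (PARTITION BY NAME, [LAST NAME], PHONE ORDER BY ID DESC) AS rn
  FROM dbo.YourTable
)
-- rn = 1 is the row to keep; every older row in the group is flagged.
UPDATE x SET STATE = CASE rn WHEN 1 THEN NULL ELSE 'DUPE' END;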

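Of course, I see no reason to actually update the table with this information; the data goes stale every time the table is touched, and the query has to be re-applied. Since you can derive this information at runtime, it should be part of a query rather than something you constantly update in the table. IMHO.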
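As a rough illustration of deriving the flag at query time, the same ROW_NUMBER() expression can be evaluated in a plain SELECT. This is a sketch that assumes the dbo.YourTable name used above; the computed column reuses the STATE name only for readability.

```sql
-- Compute the DUPE flag on the fly instead of storing it in the table.
SELECT ID, NAME, [LAST NAME], PHONE,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY NAME, [LAST NAME], PHONE
                                    ORDER BY ID DESC) > 1
            THEN 'DUPE'          -- an older copy of a newer row
       END AS STATE              -- no ELSE branch, so the newest row gets NULL
FROM dbo.YourTable
ORDER BY ID;
```

Because nothing is written back, the result always reflects the current contents of the table.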

Answer 1 (score: 1)

Try this statement.

Latest update:

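-- Flag every row whose id is lower than the highest id in its duplicate group.
update t1
set t1.STATE = 'DUPE'
from TableName t1
join
(
    -- one row per duplicated (name, last_name, phone) group, carrying its highest id
    select name, last_name, phone, max(id) as id, count(id) as cnt
    from TableName
    group by name, last_name, phone
    having count(id) > 1
) t2 on (t1.name = t2.name and t1.last_name = t2.last_name and t1.phone = t2.phone and t1.id < t2.id)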

Answer 2 (score: 0)

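If my understanding of your requirement is correct, you want to update all STATE values to DUPE when there exists another row with a higher ID that has the same NAME and LAST NAME. If so, use: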
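A correlated EXISTS is one way to express that condition. The statement below is only a sketch and not necessarily what this answer originally contained: it assumes the dbo.YourTable name from the first answer, and it also compares PHONE, which this answer does not mention but the question lists as part of the duplicate key.

```sql
-- Mark a row as DUPE when a newer row (higher ID) has the same name, last name and phone.
UPDATE t
SET    STATE = 'DUPE'
FROM   dbo.YourTable AS t
WHERE  EXISTS (SELECT 1
               FROM   dbo.YourTable AS newer
               WHERE  newer.NAME        = t.NAME
                 AND  newer.[LAST NAME] = t.[LAST NAME]
                 AND  newer.PHONE       = t.PHONE   -- assumption: taken from the question, not this answer
                 AND  newer.ID          > t.ID);
```

Rows with no newer match are left untouched, so their STATE stays NULL.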