Renaming duplicate rows

Time: 2011-03-03 03:07:59

Tags: sql-server-2005 tsql common-table-expression sql-update

Here is a simplified example of my problem. I have a table with a column named "Name" that contains duplicate entries:

ID    Name
---   ----
 1    AAA
 2    AAA
 3    AAA
 4    BBB
 5    CCC
 6    CCC
 7    DDD
 8    DDD
 9    DDD
10    DDD

Running a GROUP BY such as SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name produces the following result:

Name  Count
----  -----
AAA   3
BBB   1
CCC   2
DDD   4

I only care about the duplicates, so I add a HAVING clause: SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name HAVING COUNT(*) > 1

Name  Count
----  -----
AAA   3
CCC   2
DDD   4

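So far this is trivial, but now it gets tricky: I need a query that returns all the duplicated records, with an incrementing indicator appended to the Name column. The result should look like this: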

ID    Name
---   --------
 1    AAA
 2    AAA (2)
 3    AAA (3)
 5    CCC 
 6    CCC (2)
 7    DDD 
 8    DDD (2)
 9    DDD (3)
10    DDD (4)

Note that row 4 ("BBB") is excluded, and that the first duplicate keeps its original name.

Using an EXISTS clause gives me all the records I need, but how do I create the new Name values?

SELECT * FROM Table AS T1 
WHERE EXISTS (
    SELECT Name, COUNT(*) AS [Count] 
    FROM Table 
    GROUP BY Name 
    HAVING (COUNT(*) > 1) AND (Name = T1.Name))
ORDER BY Name

I need to create an UPDATE statement that fixes all the duplicates, i.e. changes the names according to this pattern.

Update: I figured it out. The PARTITION BY clause was what I was missing.

5 answers:

Answer 0: (score: 12)

With Dups As
    (
    Select Id, Name
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
    From Table
    )
Select D.Id
    , D.Name + Case
                When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')'
                Else ''
                End As Name
From Dups As D
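To try the query above without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions). The table name `people`, the helper `numbered_duplicates`, and the `||` concatenation operator are stand-ins for the placeholder `Table` and T-SQL's `+`. The extra `COUNT(*) OVER` column is my own addition: the query above also returns non-duplicated names such as BBB, and filtering on the per-name count drops them as the question asks.

```python
import sqlite3

ROWS = [(1, "AAA"), (2, "AAA"), (3, "AAA"), (4, "BBB"), (5, "CCC"),
        (6, "CCC"), (7, "DDD"), (8, "DDD"), (9, "DDD"), (10, "DDD")]

# rnk numbers the rows inside each name group; cnt is the group size.
QUERY = """
WITH dups AS (
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rnk,
           COUNT(*)     OVER (PARTITION BY name)             AS cnt
    FROM people
)
SELECT id,
       name || CASE WHEN rnk > 1 THEN ' (' || rnk || ')' ELSE '' END AS name
FROM dups
WHERE cnt > 1          -- keep only names that occur more than once
ORDER BY id
"""

def numbered_duplicates(rows):
    """Load rows into an in-memory table and return the renamed duplicates."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

if __name__ == "__main__":
    for row in numbered_duplicates(ROWS):
        print(row)
```

Ordering by id inside the partition is what makes the numbering deterministic: the lowest id in each group gets row number 1 and keeps its original name.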

If you want an UPDATE statement, you can use almost the same structure:

With Dups As
    (
    Select Id, Name
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
    From Table
    )
Update Table
Set Name = T.Name + Case
                    When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')'
                    Else ''
                    End
From Table As T
    Join Dups As D
        On D.Id = T.Id
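The same idea can be exercised end to end with Python's sqlite3 module. This is only a sketch under a few assumptions: SQLite 3.25 or later, a table named `people` in place of `Table`, and a temporary table `dups` in place of the CTE join. The temporary table is my substitution, chosen so that the row numbers are fixed before any name changes; the function names are hypothetical.

```python
import sqlite3

def rename_duplicates(conn):
    """Append ' (n)' to the 2nd, 3rd, ... row of every duplicated name."""
    # Snapshot the row numbers first, so the UPDATE never reads rows it is changing.
    conn.execute("""
        CREATE TEMP TABLE dups AS
        SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rnk
        FROM people
    """)
    conn.execute("""
        UPDATE people
        SET name = name || ' (' ||
                   (SELECT rnk FROM dups WHERE dups.id = people.id) || ')'
        WHERE id IN (SELECT id FROM dups WHERE rnk > 1)
    """)
    conn.execute("DROP TABLE dups")
    conn.commit()

def demo():
    """Build the sample table from the question, fix it, and return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    names = ["AAA"] * 3 + ["BBB"] + ["CCC"] * 2 + ["DDD"] * 4
    conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)",
                     list(enumerate(names, start=1)))
    rename_duplicates(conn)
    rows = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in demo():
        print(row)
```

Rows with row number 1, including the single BBB row, are left untouched, so no extra duplicate filter is needed for the update.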

Answer 1: (score: 5)

Update the subquery directly:

update d
set Name = Name+'('+cast(r as varchar(10))+')'
from    (   select  Name, 
                    row_number() over (partition by Name order by Name) as r
            from    [table]
        ) d
where r > 1

Answer 2: (score: 1)

SELECT ROW_NUMBER() OVER(ORDER BY Name) AS RowNum,
       Name,
       Name + '(' + CAST(ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Name) AS VARCHAR(10)) + ')' concatenatedName
FROM Table 
WHERE Name IN 
(
     SELECT Name 
     FROM Table 
     GROUP BY Name 
     HAVING COUNT(*) > 1
)

This gives you what you originally asked for. For the UPDATE statement, you need a WHILE loop that updates the top 1 row on each pass:

DECLARE @Pointer VARCHAR(20), @Count INT

WHILE EXISTS(SELECT Name FROM Table GROUP BY Name HAVING COUNT(1) > 1)
BEGIN
    SELECT TOP 1 @Pointer = Name, @Count = COUNT(1) FROM Table GROUP BY Name HAVING COUNT(1) > 1
    UPDATE TOP (1) TABLE
    SET Name = Name + '(' + CAST(@Count AS VARCHAR(10)) + ')'
    WHERE Name = @Pointer
END
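The loop above can be sketched in Python with sqlite3 to see why it terminates: every pass renames one row, so the remaining duplicate count for that name drops by one. The table name `people` and the function names are hypothetical. Two details are my assumptions rather than part of the answer: the sketch renames the row with the highest id (UPDATE TOP (1) picks an arbitrary matching row), and it inserts a space before the parenthesis to match the pattern in the question.

```python
import sqlite3

def rename_by_loop(conn):
    """Repeatedly pick a name that still has duplicates and rename one row."""
    while True:
        row = conn.execute("""
            SELECT name, COUNT(*) FROM people
            GROUP BY name HAVING COUNT(*) > 1 LIMIT 1
        """).fetchone()
        if row is None:          # no duplicated names left
            break
        name, count = row
        # Assumption: rename the highest id so the numbering follows id order.
        conn.execute("""
            UPDATE people SET name = name || ' (' || ? || ')'
            WHERE id = (SELECT MAX(id) FROM people WHERE name = ?)
        """, (count, name))
    conn.commit()

def demo():
    """Run the loop against the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    names = ["AAA"] * 3 + ["BBB"] + ["CCC"] * 2 + ["DDD"] * 4
    conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)",
                     list(enumerate(names, start=1)))
    rename_by_loop(conn)
    rows = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in demo():
        print(row)
```

This issues one statement per duplicate row, so the set-based ROW_NUMBER approach is likely the better fit for large tables.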

Answer 3: (score: 0)

No UPDATE is needed at all. The following generates the table contents as required, ready for an INSERT:

SELECT
    ROW_NUMBER() OVER(ORDER BY tb2.Id) Id,
    tb2.Name + CASE WHEN COUNT(*) > 1 THEN ' (' + CONVERT(VARCHAR, Count(*)) + ')' ELSE '' END [Name]
FROM
    tb tb1,
    tb tb2
WHERE
    tb1.Name = tb2.Name AND
    tb1.Id <= tb2.Id
GROUP BY
    tb2.Name,
    tb2.Id
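The self-join works as a running count: for each row, it counts the rows with the same name and an id less than or equal to its own. Here is a small sketch of that technique in Python with sqlite3. It needs no window functions. The table name `people` and the function name are hypothetical, and unlike the query above it keeps the original id instead of renumbering it.

```python
import sqlite3

ROWS = [(1, "AAA"), (2, "AAA"), (3, "AAA"), (4, "BBB"), (5, "CCC"),
        (6, "CCC"), (7, "DDD"), (8, "DDD"), (9, "DDD"), (10, "DDD")]

# COUNT(*) per t2 row = number of same-named rows whose id is <= t2.id.
QUERY = """
SELECT t2.id,
       t2.name || CASE WHEN COUNT(*) > 1
                       THEN ' (' || COUNT(*) || ')' ELSE '' END AS name
FROM people AS t1
JOIN people AS t2
  ON t1.name = t2.name AND t1.id <= t2.id
GROUP BY t2.id, t2.name
ORDER BY t2.id
"""

def running_count_names(rows):
    """Return (id, name) pairs with a running-count suffix on repeated names."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

if __name__ == "__main__":
    for row in running_count_names(ROWS):
        print(row)
```

Note that this form returns every row, including the non-duplicated BBB, and the join grows quadratically with the size of each name group.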

Answer 4: (score: 0)

Here is a simpler UPDATE statement:

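-- Window functions are only allowed in SELECT and ORDER BY,
-- so compute the row number in a CTE and update through it.
WITH Numbered AS
(
    SELECT [Name], ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY Id) AS Rn
    FROM tb
)
UPDATE Numbered
SET [Name] = [Name] + ' (' + CONVERT(VARCHAR(10), Rn) + ')'
WHERE Rn > 1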