从行群集中删除重复的键

时间:2014-02-25 01:31:17

标签: vba

我有一个由其他几个工作表构建的大工作表。工作表增长到数千行。有些列包含通常具有重复值的键。大多数非唯一键出现在连续行的簇中。我有代码在任何重复键集群中的第一个之后清除重复键。它简单,高效,当应用于数千行时,它会运行一段时间。我想让它跑得更快。我有一个算法,当密钥被排序时运行良好,因此每个唯一值的所有重复项都出现在一个且只有一组行中。我认为我需要的是使用一个唯一值列表,为每个唯一值提供高级过滤器。但是需要提供的是匹配行的列表,而不是匹配值列表。高级过滤器提供范围中的值,但是,当工作表显示每个值的相应行号时,该行号似乎不可从提供的范围中获得。

因此,给定列中的数据可能具有以下形式:

header
a
a
a
a
b
b
b
b
c
c
c
c
a
a
a
a
d
d
d
d

等等。

这个VBA运作良好但很慢:

Sub Delete_Dupes()  
    Dim k As Long  
    Dim j As Long  
    Dim i As Long  
    For j = 0 To Selection.Columns.count - 1  
        k = 0 
        For i = 1 To Selection.Rows.count Step 1  
            If Selection.Cells(1, 1).Offset(k, j).Value = _  
                Selection.Cells(1, 1).Offset(i, j).Value Then  
                    Selection.Cells(1, 1).Offset(i, j).Clear  
            Else 
                k = i  
            End If  
        Next i  
   Next j  
End Sub  

我正在使用高级过滤器进行改进,但无法将不相交的群集分开。

Sub DeleteDupes()  
'Assumption:  all unique items are grouped together on adjacent rows  
'heres how this works  
'get rid of the empty cells that the for each loop would find  
'change them to something wierd  
'then get the advanced filter list of unique items  
'for each unique item filter for all items of that type  
'  clear the n-1 items following the first  
'replace the something wierd with nothing 
    Dim first As Boolean  
    Dim replaceEmpty As String  
    Dim things As Range  
    Dim others As Range  
    Dim uniques As Collection  
    Set uniques = New Collection  
    Set things = Selection  
    replaceEmpth = "!%*****@^"  
    'have to get rid of "Empty" cells  
    Selection.Replace What:="", Replacement:=replaceEmpty, LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False    
    Selection.advancedFilter Action:=xlFilterInPlace, unique:=True  
    For Each cl In Selection.SpecialCells(xlCellTypeVisible)  
         uniques.Add cl.Value  
    Next cl  
    'this is interesting...  
    'the row heading always appears as part of the list of uniques  
    'and when filtering for a unique value the row heading is always the first entry  
    Z = 2                                                   'first value is never the row heading (1)  
    first = True  
    For Each cl In uniques  
        If first = False Then                                 'skip the row heading  
            Selection.AutoFilter field:=1, Criteria1:=cl        'filter for the unique value  
            y = Selection.SpecialCells(xlCellTypeVisible).count 'how many did you get?  
            If y > 2 Then                                       'need more than 2 to do any clearing  
                Range(Selection.SpecialCells(xlCellTypeVisible).Cells(1, 1).Offset(Z, 0), _
                Selection.SpecialCells(xlCellTypeVisible).Cells(1, 1).Offset(Z + y - 2, 0)).Clear  
                Z = Z + y - 1                                     'index past what was cleared  
            Else  
                Z = Z + 1                                         'index past single unique value  
            End If  
            ActiveSheet.ShowAllData  
        Else  
          first = False  
        End If  
    Next cl  
    Selection.Replace What:=replaceEmpty, Replacement:="", LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False  
End Sub  

range.deleteduplicates非常整洁

建立要清除的范围列表也非常简洁

我想我这两个都会运作良好。

谢谢!

这是我编写的解决方案,在贡献者的话中,似乎相当快

Sub x()
    Dim cl As Range
    Dim tgt As String
    昏暗的副本作为范围
    Dim clearrange As Range
    Dim x As Long
    Dim col As String
    Dim关键字As String
    Dim firstindex As Long
    Dim lastindex As Long
    Dim uniques As Collection
    D y y As String
    col =“A”
    设置uniques =新收藏
    表( “工作表Sheet1”)。激活
    x = ActiveSheet.UsedRange.Rows.Count
    tgt = col& “1:”& col& X
    ActiveSheet.Range(TGT)。选择
    Selection.AdvancedFilter action:= xlFilterInPlace,unique:= True
    对于每个cl In Range(tgt).SpecialCells(xlVisible)
        uniques.Add cl.Value'这是一个唯一值列表
    下一个cl
    Dim ucount As Long
    ucount = 0
    昏暗的名字作为变体
    对于每个名称在uniques'检查dupes每个唯一值
        如果ucount = 0那么
            ucount = 1
        否则
            范围(tgt).AutoFilter字段:= 1,Criteria1:= name
            firstindex = 0
            对于每个重复范围(tgt).SpecialCells(xlVisible)
                如果dups.Row<> 1然后'跳过第1行标题行
                    如果firstindex = 0那么'0->开始新的行组
                        firstindex = dups.Row
                        lastindex = firstindex
                    否则
                        如果dups.Row = lastindex + 1那么'是dup.row单调递增?
                            lastindex = dups.Row
                        否则,不会增加1,所以定义了范围                             y = col& firstindex + 1& “:”& col& lastIndex的
                            范围(Y).Clear
                            firstindex = dups.Row
                            lastindex = dups.Row
                        结束如果
                    结束如果
                结束如果
            下一个重复
            如果lastindex - firstindex> 0然后'检查最后一个dupe范围是否需要清除
                y = col& firstindex + 1& “:”& col& lastIndex的
                范围(Y).Clear
            结束如果'如果过滤器范围内的第一个和第二个单元格相同则清除第二个单元             如果范围(col&“1:”& col&“1”)。值=范围(col&“2:”& col&“2”)。值然后
                范围(col&“2:”& col&“2”)。清除
            结束如果
        结束如果
    下一个名字
    AutoFilter = False'关闭过滤功能
    ActiveSheet.ShowAllData'并显示数据
结束子

1 个答案:

答案 0 :(得分:0)

这似乎相当快:

Sub Delete_Dupes()
    Dim rng As Range, col As Range, c As Range
    Dim x As Long, rngClear As Range

    Set rng = Selection
    For Each col In rng.Columns
        For x = col.Cells.Count To 2 Step -1
            Set c = col.Cells(x)
            If c.Value = c.offset(-1,0).Value Then
                If Not rngClear Is Nothing Then
                    Set rngClear = Application.Union(rngClear, c)
                Else
                    Set rngClear = c
                End If
            End If
        Next x
    Next col

    If Not rngClear Is Nothing Then rngClear.Clear

End Sub