根据其出现次数删除重复项

时间:2015-11-29 16:01:30

标签: excel vba duplicates countif find-occurrences

我想检查某个列(W)是否有重复项(出现次数存储在另一列(AZ)中)而不是以这种方式删除所有行:

  • 在列中找到两次值 - 仅删除包含值的一行。
  • 在列中找到更多次的值 - 删除包含值的所有行。

我的代码工作得很好,但有时它并没有像应该的那样删除所有重复项。有什么改进的想法吗?

编辑:更新的代码非常好用,只是它总是错过一个副本并且不会删除它。

fin = ws.UsedRange.Rows.count

For i = 2 To fin
    ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i))
Next i

For j = fin To 2 Step -1
    If ws.Range("AZ" & j).value > 2 Then
        ws.Range("AZ" & j).EntireRow.Delete
        fin = ws.UsedRange.Rows.count
    ElseIf ws.Range("AZ" & j).value = 2 Then
        Set rng = Range("W:W").Find(Range("W" & j).value, , xlValues, xlWhole, , xlNext)
        rngRow = rng.Row
        If rngRow <> j Then
            ws.Range("AZ" & rngRow) = "1"
            ws.Range("AZ" & j).EntireRow.Delete
            fin = ws.UsedRange.Rows.count
        Else
            MsgBox "Error at row " & rngRow
        End If
    End If
Next j

3 个答案:

答案 0 :(得分:1)

如果速度是一个问题,这里的方法应该更快,因为它创建了一个要删除的行集合,然后删除它们。由于除了实际的行删除之外的所有内容都是在VBA中完成的,因此来回工作表的调用次数要少得多。

如内联评论中所述,可以加快例行程序。 如果它仍然太慢,根据工作表的大小,将整个工作表读入VBA数组可能是可行的;测试重复;将结果写回新数组并将其写入工作表。 (如果您的工作表太大,则此方法可能会耗尽内存。)

无论如何,我们既需要类模块,您必须重命名cPhrases,以及常规模块

班级单元

Option Explicit
Private pPhrase As String
Private pCount As Long
Private pRowNums As Collection

Public Property Get Phrase() As String
    Phrase = pPhrase
End Property
Public Property Let Phrase(Value As String)
    pPhrase = Value
End Property

Public Property Get Count() As Long
    Count = pCount
End Property
Public Property Let Count(Value As Long)
    pCount = Value
End Property

Public Property Get RowNums() As Collection
    Set RowNums = pRowNums
End Property
Public Function ADDRowNum(Value As Long)
    pRowNums.Add Value
End Function


Private Sub Class_Initialize()
    Set pRowNums = New Collection
End Sub

常规模块

Option Explicit
Sub RemoveDuplicateRows()
    Dim wsSrc As Worksheet
    Dim vSrc As Variant
    Dim CP As cPhrases, colP As Collection, colRowNums As Collection
    Dim I As Long, K As Long
    Dim R As Range

'Data worksheet
Set wsSrc = Worksheets("sheet1")

'Read original data into VBA array
With wsSrc
    vSrc = .Range(.Cells(1, "W"), .Cells(.Rows.Count, "W").End(xlUp))
End With

'Collect list of items, counts and row numbers to delete
'Collection object will --> error when trying to add
'  duplicate key.  Use that error to increment the count

Set colP = New Collection
On Error Resume Next
For I = 2 To UBound(vSrc, 1)
        Set CP = New cPhrases
        With CP
            .Phrase = vSrc(I, 1)
            .Count = 1
            .ADDRowNum I

            colP.Add CP, CStr(.Phrase)
            Select Case Err.Number
                Case 457 'duplicate
                    With colP(CStr(.Phrase))
                        .Count = .Count + 1
                        .ADDRowNum I
                    End With
                    Err.Clear
                Case Is <> 0 'some other error.  Stop to debug
                    Debug.Print "Error: " & Err.Number, Err.Description
                    Stop
            End Select
        End With
Next I
On Error GoTo 0

'Rows to be deleted
Set colRowNums = New Collection
For I = 1 To colP.Count
    With colP(I)
        Select Case .Count
            Case 2
                colRowNums.Add .RowNums(2)
            Case Is > 2
                For K = 1 To .RowNums.Count
                    colRowNums.Add .RowNums(K)
                Next K
        End Select
    End With
Next I

'Revers Sort the collection of Row Numbers
'For speed, if necessary, could use
'   faster sort routine
RevCollBubbleSort colRowNums

'Delete Rows
'For speed, could create Unions of up to 30 rows at a time
Application.ScreenUpdating = False
With wsSrc
For I = 1 To colRowNums.Count
   .Rows(colRowNums(I)).Delete
Next I
End With

Application.ScreenUpdating = True

End Sub

'Could use faster sort routine if necessary
Sub RevCollBubbleSort(TempCol As Collection)
    Dim I As Long
    Dim NoExchanges As Boolean

    ' Loop until no more "exchanges" are made.
    Do
        NoExchanges = True

        ' Loop through each element in the array.
        For I = 1 To TempCol.Count - 1

            ' If the element is less than the element
            ' following it, exchange the two elements.
            If TempCol(I) < TempCol(I + 1) Then
                NoExchanges = False
                TempCol.Add TempCol(I), after:=I + 1
                TempCol.Remove I
            End If
        Next I
    Loop While Not (NoExchanges)
End Sub

答案 1 :(得分:0)

不需要在第二部分使用那个效率低下的第二个循环,只需使用像这样的实时计数

var http = require('http');
var fs = require('fs');
http.createServer(function(request, response) {
response.writeHead(200, {'Content-Type': 'text/html'});

switch(request.url){

    case '/':
        template = "home.html";
        break;
    case '/nodejs':
        template = "node.html";
        break;
    default:
        template = "404.html";
        break;

}

fs.readfile('./templates/' + template , function(err, data) {

    response.write(data);
    response.end();
});

}).listen(3000, 'localhost');

答案 2 :(得分:0)

虽然你的逻辑基本上是合理的,但这种方法并不是最有效的。 AutoFilter Method可以快速删除大于2的所有计数,Range.RemoveDuplicates¹ method可以随后快速删除仍然包含W列中重复值的行之一。

meshes.position.z = 2;   

当您重写AZ列中的计数值时,您可能会将 3 计数重写为 2 等。

¹ Range.RemoveDuplicates method从下往上删除重复的行。