VBA查找区分大小写的DUPLICATE行而不是单元格并删除

时间:2019-10-25 13:34:51

标签: excel vba

希望从下面发布的此处修改一个绝妙的答案。但是,下面的答案仅比较第一行中要标记并删除的每一行中的值。

但是我想看看第一列是否相同,如果是,请检查所有其他列是否相同,然后在整行都存在的情况下对其进行标记。

 tried amending the 
       IF Not .Exists(v(i,1)) Then to 
       IF Not .Exists(v(i,1)) and IF Not .Exists(v(i,2)) Then

没有工作也尝试过

   IF Not .Exists(v(i,1)) Then
    IF Not .Exists(v(i,2)) Then

Sub RemoveDuplicateRows()

Dim data As Range
Set data = ThisWorkbook.Worksheets("Sheet3").UsedRange

Dim v As Variant, tags As Variant
v = data
ReDim tags(1 To UBound(v), 1 To 1)
tags(1, 1) = 0 'keep the header

Dim dict As Dictionary
Set dict = New Dictionary
dict.CompareMode = BinaryCompare

Dim i As Long
For i = LBound(v, 1) To UBound(v, 1)
    With dict
        If Not .Exists(v(i, 1 And 2)) Then 'v(i,1) comparing the values in the first column
              tags(i, 1) = i
            .Add Key:=v(i, 1), Item:=vbNullString
         End If
      End With
Next i

Dim rngTags As Range
Set rngTags = data.Columns(data.Columns.count + 1)
rngTags.Value = tags

Union(data, rngTags).Sort key1:=rngTags, Orientation:=xlTopToBottom, Header:=xlYes

Dim count As Long
count = rngTags.End(xlDown).Row

rngTags.EntireColumn.Delete
data.Resize(UBound(v, 1) - count + 1).Offset(count).EntireRow.Delete

End Sub

1 个答案:

答案 0 :(得分:0)

我第一次尝试解决方案是使用SQL语句仅返回DISTINCT行。

但是,由于不支持COLLATION,所以VBA中的SQL语句近似模拟区分大小写的行为将不会达到我想要的效率。

说,据我所知,VBA中唯一的替代方法是遍历数据集。

尝试使用以下子过程,并告诉我它如何运行:

代码:

Sub remove_duplicates(ByVal wk_sheet As Worksheet, ByVal rng As Range)

'   +----------------------------------------------------------+
'   | DESCRIPTION:                                             |
'   |   Removes all duplicate whole rows in a range.           |
'   |   Case sensitive.                                        |
'   |                                                          |
'   | VARIABLES:                                               |
'   |   wk_sheet = Worksheet object where our data is stored.  |
'   |   rng = Range object where our data is stored.           |
'   |   arr = array to store the matrix.                       |
'   |   a = variables to store rows for comparison.            |
'   |   b = variables to store rows for comparison.            |
'   |   dirrng = string to store the references of rows        |
'   |            to delete.                                    |
'   |   rngc1 = string storing first column reference of       |
'   |           range.                                         |
'   |   rngc2 = string storing last column reference of        |
'   |           range.                                         |
'   |                                                          |
'   +----------------------------------------------------------+

    Dim arr As Variant, a As Variant, b As Variant
    Dim dirrng As String, rngc1 As String, rngc2 As String

    With rng
        arr = .Value
        rngc1 = Split(Mid(.Cells(1, 1).Address, 2), "$")(0)
        rngc2 = Split(Mid(.Cells(1, .columns.Count).Address, 2), "$")(0)
    End With

    For i = 1 To UBound(arr)
        a = Join(Application.WorksheetFunction.Index(arr, i, 0), "|")
        For r = 1 To UBound(arr)
            If i <> r And _
            (dirrng = "" Or _
             Not InStr(1, dirrng, _
                       rngc1 & i & ":" & rngc2 & i, vbBinaryCompare) > 0) Then
                b = Join(Application.WorksheetFunction.Index(arr, r, 0), "|")
                If a = b Then
                    If Len(dirrng) > 0 Then
                        dirrng = dirrng & "," & rngc1 & r & ":" & rngc2 & r
                    Else
                        dirrng = rngc1 & r & ":" & rngc2 & r
                    End If
                End If
            End If
        Next r
    Next i

    'Deleting all rows at once is more efficient than deleting one at time
    If Len(dirrng) > 0 Then rng.Range(dirrng).Delete Shift:=xlUp

End Sub

测试:

我使用以下数据集进行了测试:

enter image description here

重要提示:请注意,这里我们有一个命名范围“ TABLE_CONTENT”,其中包含我们的数据字段。如果要使用整个表作为交互范围并保留标题,则必须修改代码的版本。

然后,我使用以下代码调用 remove_duplicates 子进程,并传递适当的参数:

Sub test()
    Call remove_duplicates(ActiveSheet, ActiveSheet.Range("TABLE_CONTENT"))
End Sub

结果:

enter image description here

希望有帮助。