VBA删除更快地复制代码

时间:2017-12-06 02:23:15

标签: excel vba excel-vba

目前使用此代码,我有一大堆数据,而且运行速度非常慢。我需要删除任何重复的信息,并保留最高的信息行。

dim dup as variant, r as long, lncheckduplicatescolumn as long
With wb_DST.Sheets(sWs_DST)
     lncheckduplicatescolumn = .Cells(.Rows.Count, "A").End(xlUp).row
     for r = lncheckduplicatescolumn  to 2 step -1
         dup = application.match(.cells(r, "A").value, .columns(1), 0)
             if dup < r then .rows(dup).delete
     next r
end with

数据:

       Column A              Column B
          A                      1
          B                      2
          C                      3
          A                      3

结果应该是:

           B                     2
           C                     3
           A                     3

A列中的数据顺序无关紧要,只要它是唯一的,并保留较高行号中的信息。虽然我共享的代码有效,但对于大型数据集来说它太慢了。

2 个答案:

答案 0 :(得分:3)

另一种快速方法是使用Dictionary对象。您可以检查列{A}中的任何值是否已存在于Dictionary中。如果他们这样做(意味着它是重复的),那么不要每次都删除它们,这为代码的运行时间增加了很长时间。相反,您可以使用DelRng对象,Range使用Union来合并多个重复的行。

稍后,您可以使用DelRng.Delete一次删除整个ducplicates范围。

<强> 代码

Option Explicit

Sub RemoveDuplicatesUsingDict()

Dim wb_DST As Workbook
Dim sWs_DST As String

'  Dictionary variables
Dim Dict As Object
Dim DictIndex As Long, ExistIndex As Long

Dim DelRng As Range
Dim LastRow As Long, i As Long

' --- parameters for my internal testing ---
Set wb_DST = ThisWorkbook
sWs_DST = "Sheet1"

Application.ScreenUpdating = False

Set Dict = CreateObject("Scripting.Dictionary")

With wb_DST.Sheets(sWs_DST)
     LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row ' get last row with data in column "A"

     For i = LastRow To 2 Step -1
        If Not Dict.exists(.Range("A" & i).Value) Then ' value doesn't exists yet in Dictionary >> add this Key
            Dict.Add .Range("A" & i).Value, .Range("A" & i).Value
        Else ' value already exists in Dictionary >> add it to DelRng (later will delete the entire range)
            If Not DelRng Is Nothing Then
                Set DelRng = Application.Union(DelRng, .Rows(i)) ' add current row to existing DelRng
            Else
                Set DelRng = .Rows(i)
            End If
        End If
     Next i
End With

' delete the entire range at 1-shot
If Not DelRng Is Nothing Then DelRng.Delete

Application.ScreenUpdating = True

End Sub

答案 1 :(得分:0)

快速使用数据字段数组

在一个范围内循环并不快 - 如果你用搜索数据创建一个数据字段数组,你可以加速它(array =#34; A&#34; - 参见1)并在其中循环。如果您的数据集增长,与上面显示的字典方法相比,这会变得更快,尽管它依赖于一种良好且可靠的方法。

搜索方法

  • 根据已找到唯一值的连接搜索字符串检查任何数组值,如果尚未包含,则添加(参见2)
  • 完成的字符串转换为数组并写回给定的目标列(例如&#34; H&#34;) - 参见3)和4)

我甚至添加了第二列以及相应的行号,因此您应该可以使用它们进行进一步操作。您也可以将结果写入另一张表。

代码 - 方法演示

Sub RemoveDuplicates()
Dim t    As Double: t = Timer   ' stop watch
Dim ws   As Worksheet           ' source sheet (object)
Dim i    As Long                ' row index
Dim a, arr, arr2                ' variant
Dim s    As String, si As String
Const SEP = "|"                 ' delimiter
s = SEP: si = SEP
' 0) fully qualified range reference to source sheet
  Set ws = ThisWorkbook.Worksheets("Sheet1")
' 1) write column A data to one based 2-dim data field array
  a = ws.Range("A1:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row)
' 2) loop through data and check for duplicates in concatenated string s
  For i = 2 To UBound(a)
     If InStr(s, SEP & a(i, 1) & SEP) = 0 Then
        If Len(a(i, 1)) > 0 Then
           s = s & a(i, 1) & SEP
           si = si & i & SEP
        End If
     End If
  Next i
' 3) transform unique values to zero based 1-dim array
  arr = Split(Mid(s, 2), SEP)          ' duplicates string to array
  arr2 = Split(Mid(si, 2), SEP)        ' found row numbers
' 4) write result to column H2:H...      ' <<< change target to wanted column
  ws.Range("H:H").ClearContents         '
  ws.Range("H2:H" & (2 + UBound(arr))).Value = Application.Transpose(arr)
  ws.Range("I2:I" & (2 + UBound(arr2))).Value = Application.Transpose(arr2)
  Debug.Print UBound(arr) + 0 & " unique items found", Format(Timer - t, "0.00 seconds needed")

End Sub

=============================================== ==================

修改

版本2 - 包括使用唯一值覆盖原始数据

在这里,您会发现一个略有修改的版本,用35个列(A2:AI ..)覆盖具有唯一值的原始数据。

Sub RemoveDuplicates2()
' Edit: overwrite original data A2:AI{..} with unique values
Dim t    As Double: t = Timer   ' stop watch
Dim ws   As Worksheet           ' source sheet (object)
Dim i    As Long                ' row index
Dim a, arr, arr2                ' variant
Dim s    As String, si As String
Const SEP = "|"                 ' delimiter
Const MyLastColumn = "AI"       ' letter of last column (no 35) = "AI"
s = SEP: si = SEP
' fully qualified range reference to source sheet
  Set ws = ThisWorkbook.Worksheets("Sheet1")
' write column A data to one based 2-dim data field array
  a = ws.Range("A1:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row)
' loop through data and check for duplicates in concatenated string s
  For i = 2 To UBound(a)        '  For i = UBound(a) To 2 Step -1
     If InStr(s, SEP & a(i, 1) & SEP) = 0 Then
        If Len(Trim(a(i, 1))) > 0 Then
           s = s & a(i, 1) & SEP
           si = si & i & SEP
        End If
     End If
  Next i
' write unique values to zero based 1-dim array (starts with index 0; last delimiter removed in this version)
  arr2 = Split(Mid(si, 2, Len(si) - 2), SEP)     ' found row numbers
' overwrite original data      
  For i = LBound(arr2) To UBound(arr2)  ' starts with index 0!
      s = "A" & arr2(i) & ":" & MyLastColumn & arr2(i)
      arr = ws.Range(s)        ' create 1-based 1-line (2-dim) array
      s = "A" & i + 2 & ":" & MyLastColumn & i + 2   ' 0 + 2 = +2 ... start in row 2
      ws.Range(s) = arr        ' write back unique row values
  Next i
  s = "A" & UBound(arr2) + 3 & ":" & MyLastColumn & UBound(a) + 1
  ws.Range(s).ClearContents     ' clear rest of original data
  Debug.Print UBound(arr2) + 1 & " unique items found", Format(Timer - t, "0.00 seconds needed")          ' result
End Sub