VBA,从数组中删除重复项

时间:2012-08-08 17:39:38

标签: excel excel-vba vba

有人可以就如何解决以下问题给我一些指示:

假设我在Excel 2010中有一个数据块,100行乘3列。

C列包含一些重复项,比如从1,1,1 2,3,4,5,.....,97,98开始

使用VBA,我想删除duplcate行,根据C列,所以我留下了1,2,3,.....,97,98,即只有98行和3列。

我知道我可以在Excel 2010中单击一个按钮来执行此操作,但我想在VBA中执行此操作(因为我已经尝试了这个并且出于某种原因,随后对我的其余代码进行了干预并给出了错误的结果)。

此外,我想在数组中执行此操作,然后将结果粘贴到工作表上,而不是像Application.Worksheetfunction.countif(.....

这样的方法。

类似于:

Dim myarray() as Variant
myarray=cells(1,1).Currentregion.value

Dim a as Long

For a=1 to Ubound(myarray,1)

    'something here to 

Next a

8 个答案:

答案 0 :(得分:5)

I answered a similar question。这是我使用的代码:

Dim dict As Object
Dim rowCount As Long
Dim strVal As String

Set dict = CreateObject("Scripting.Dictionary")

rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count

'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
  strVal = Sheet1.Cells(rowCount, 1).Value2

  If dict.exists(strVal) Then
    Sheet1.Rows(rowCount).EntireRow.Delete
  Else
    'if doing this with an array, then add code in the Else block
    ' to assign values from this row to the array of unique values
    dict.Add strVal, 0
  End If

  rowCount = rowCount - 1
Loop

Set dict = Nothing

如果要使用数组,则使用相同的条件(if / else)语句遍历元素。如果字典中不存在该项,则可以将其添加到字典中并将行值添加到另一个数组中。

老实说,我认为最有效的方法是调整你从宏录制器获得的代码。您可以在一行中执行上述功能:

    Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes

答案 1 :(得分:3)

Function eliminateDuplicate(poArr As Variant) As Variant
Dim poArrNoDup()

    dupArrIndex = -1
For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
Next i

eliminateDuplicate = poArrNoDup
End Function

答案 2 :(得分:1)

通过在内部循环中添加Exit For可以轻松改善@RBILLC的答案:

Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
                Exit For
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function

答案 3 :(得分:1)

对@RBILLC和@ radoslav006答案的改进,此版本在删除了重复项的数组中搜索现有值,以便搜索较少的值以找到重复项。

Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
    Dim duplicateFound As Boolean
    Dim arrayIndex As Integer, i As Integer, j As Integer
    Dim deduplicatedArray() As Variant
    
    arrayIndex = -1
    deduplicatedArray = Array(1)

    For i = LBound(sourceArray) To UBound(sourceArray)
        duplicateFound = False

        For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
            If sourceArray(i) = deduplicatedArray(j) Then
                duplicateFound = True
                Exit For
            End If
        Next j

        If duplicateFound = False Then
            arrayIndex = arrayIndex + 1
            ReDim Preserve deduplicatedArray(arrayIndex)
            deduplicatedArray(arrayIndex) = sourceArray(i)
        End If
    Next i

    RemoveDuplicatesFromArray = deduplicatedArray
End Function

答案 4 :(得分:0)

字典最多包含255个项目,因此如果您有更多值,则需要使用集合。不幸的是,Collection对象没有.Contains(a)或.Exists(a)方法,但是这个函数通过使用错误号很好地处理(伪造)它:

更正:字典没有这样的限制(感谢Zairja)。我可能一直在使用Integer迭代我的Dictionary。无论如何,这个函数允许你检查集合中是否存在项目,所以如果它对任何人都有用,我会留在这里:

CollContainsItem(col As Collection, val As Variant) As Boolean

Dim itm As Variant
On Error Resume Next

    itm = col.Item(val)
    CollContainsItem = Not (Err.Number = 5 Or Err.Number = 9)

On Error GoTo 0

End Function

因此,如果你确实需要一个Collection,你可能只需要替换

dict.Exists(strVal)

CollContainsItem(coll, strVal)

并替换

Set dict = CreateObject("Scripting.Dictionary")

Set coll = CreateObject("Scripting.Collection")

使用Zairja的其余代码。 (我实际上并没有尝试,但它应该很接近)

答案 5 :(得分:0)

我知道这很旧了,但是这里是我用来将重复值复制到另一个范围的方法,以便可以快速查看它们,以建立我从各种电子表格中站起来的数据库的数据完整性。要使该过程删除重复项,就像将dupRng行替换为Cell.Delete Shift:=xlToLeft之类的操作一样简单。

我还没有亲自测试过,但是应该可以。

Sub PartCompare()
    Dim partRng As Range, partArr() As Variant, i As Integer
    Dim Cell As Range, lrow As Integer

    lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
    i = 0

    Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))

    For Each Cell In partRng.Cells
        ReDim Preserve partArr(i)
        partArr(i) = Cell.Value
        i = i + 1
    Next

    Dim dupRng As Range, j As Integer, x As Integer, c As Integer

    Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")

    x = 0
    c = 1
    For Each Cell In partRng.Cells
        For j = c To UBound(partArr)
            If partArr(j) = Cell.Value Then
                dupRng.Offset(x, 0).Value = Cell.Value
                dupRng.Offset(x, 1).Value = Cell.Address()
                x = x + 1
                Exit For
            End If
        Next j
        c = c + 1
    Next Cell
End Sub

答案 6 :(得分:0)

简单功能,可从一维数组中删除重复项

Private Function DeDupeArray(vArray As Variant) As Variant
  Dim oDict As Object, i As Long
  Set oDict = CreateObject("Scripting.Dictionary")
  For i = LBound(vArray) To UBound(vArray)
    oDict(vArray(i)) = True
  Next
  DeDupeArray = oDict.keys()
End Function

答案 7 :(得分:0)

从数组中删除重复项(以及相关的行项目)

由于OP希望VBA解决方案接近RemoveDuplicates,所以我演示了一种使用►字典的数组方法,其本身并不是获取唯一项dict.keys),首次出现(dict.items)的相关行索引

利用►LeaveUniques函数的高级功能-c.f,它们可通过过程Application.Index()保留整个行数据。 Some peculiarities of the the Application.Index function

示例呼叫

Sub ExampleCall()
'[0]define range and assign data to 1-based 2-dim datafield
    With Sheet1                   ' << reference to your project's sheet Code(Name)
        Dim lastRow: lastRow = .Cells(.Rows.Count, "C").End(xlUp).Row
        Dim rng:  Set rng = .Range("C2:E" & lastRow)
    End With
    Dim data: data = rng        ' assign data to 2-dim datafield
'[1]get uniques (column 1) and remove duplicate rows
    LeaveUniques data           ' << call procedure LeaveUniques (c.f. RemoveDuplicates)
'[2]overwrite original range
    rng.Clear
    rng.Resize(UBound(data), UBound(data, 2)) = data
End Sub

过程LeaveUniques

Sub LeaveUniques(ByRef data As Variant, Optional ByVal colNum As Long = 1)
'Purpose: procedure removes duplicates of given column number in entire array
    data = Application.Index(data, uniqueRowIndices(data, colNum), nColIndices(UBound(data, 2)))
End Sub

帮助功能LeaveUniques

Function uniqueRowIndices(data, Optional ByVal colNum As Long = 1)
'Purpose: return data index numbers referring to uniques
'a) set late bound dictionary to memory
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
'b) slice e.g. first data column (colNum = 1)
    Dim colData
    colData = Application.Index(data, 0, colNum)
'c) fill dictionary with uniques referring to first occurencies
    Dim i As Long
    For i = 1 To UBound(colData)
        If Not dict.exists(dict(colData(i, 1))) Then dict(colData(i, 1)) = i
    Next
'd) return 2-dim array of valid unique 1-based index numbers
    uniqueRowIndices = Application.Transpose(dict.items)
End Function

Function nColIndices(ByVal n As Long)
'Purpose: return "flat" array of n column indices, e.g. for n = 3 ~> Array(1, 2, 3)
    nColIndices = Application.Transpose(Evaluate("row(1:" & n & ")"))
End Function