我有一个工作表,其中包含已经过滤的~8,000行。我试图从工作表列中获取一组值,并删除重复项。通过这里的帖子阅读有两种方法可以做到这一点。如果新集合中尚不存在该值,则循环访问集合并复制到新集合 或者将列中的数据复制到临时电子表格中,过滤并将数据复制到另一列,然后将其添加到集合中。
当处理大量数据时,复制过滤器具有最佳性能,但由于必须创建新工作表,因此很笨重。
我还没有看到它,但有没有办法在内存中进行复制过滤而不是创建工作表来执行此操作?
复现:
Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection)
Dim RowIndex As Long
For RowIndex = 1 To GetLastRow(Ws)
If CollValues.Count = 0 Then
CollValues.Add (Ws.Cells(RowIndex, Column).Value)
Else
If IsInCollection(CollValues, Ws.Cells(RowIndex, Column).Value) = False Then
CollValues.Add (Ws.Cells(RowIndex, Column).Value)
End If
End If
Next RowIndex
End Sub
过滤和复制
Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection)
Dim rowLast As Long
Dim c As Range
Dim tmpWS As Worksheet
Dim tmpWsName As String
tmpWsName = "TempWS"
Call DeleteWs(TsWb, tmpWsName)
Set tmpWS = TsWb.Sheets.Add()
tmpWS.Name = tmpWsName
rowLast = GetLastRow(Ws)
Ws.Range(Ws.Cells(1, Column), Ws.Cells(rowLast, Column)).Copy
tmpWS.Range("A1").PasteSpecial
rowLast = GetLastRow(tmpWS)
tmpWS.Range(tmpWS.Cells(1, 1), tmpWS.Cells(rowLast, 1)).AdvancedFilter _
Action:=xlFilterCopy, _
CopyToRange:=tmpWS.Range("B1"), _
Unique:=True
rowLast = GetLastRow(tmpWS)
For Each c In tmpWS.Range(tmpWS.Cells(1, 2), tmpWS.Cells(rowLast, 2))
If Len(c.value) > 0 Then
CollValues.Add (c.value)
End If
Next c
Call DeleteWs(TsWb, tmpWsName)
End Sub
答案 0 :(得分:0)
是的,只需创建数组,然后检查数组,然后将结果发送回工作表。就个人而言,我喜欢在内存中而不是通过Application IDE。
速度更快(特别是成千上万行),您不必担心屏幕刷新,或让您的用户想知道一切都在快速移动的情况。我通常处理内存中的所有内容,将其移回,然后激活我希望用户看到的工作表。
dim set1Array() as String
dim set2Array() as String
dim set1Rows as Long
dim set2Rows as Long
dim lngX as Long
dim lngY as Long
dim blnDebug as Boolean; blnDebug = true ' flag for debugging
' get count of rows so we know how big to make the arrays
set1Rows = GetLastRow(Ws1)
set2Rows = GetLastRow(Ws2)
' set arrays to the proper size
redim set1Rows(set1Rows - 1, 1)' 1 represents 2 columns since it's 0 based. the second column is a flag for duplicated.
redim set2Rows(set2Rows - 1, 0)' 0 represents 1 column since it's 0 based
' load the arrays with the sheet data
for lngX = 1 to set1Rows
set1Rows(lngX - 1, 0) = Worksheets("Sheet1").range("A" & lngX).Text
next lngX
for lngX = 1 to set2Rows
set2Rows(lngX - 1, 0) = Worksheets("Sheet2").range("A" & lngX).Text
next lngX
' I like to do a debug callout here to see what I got to make sure that I am good to go with the dataset
if blnDebug then
for lngX = 0 to Ubound(set1Rows)
debug.print "set1Rows(" & lngX & ") - col1: " & set1Rows(lngX, 0)
next lngX
for lngX = 0 to Ubound(set2Rows)
debug.print "set2Rows(" & lngX & ") - col1: " & set2Rows(lngX, 0)
next lngX
end if
' now do your comparison
for lngX = 0 to Ubound(set1Rows)
for lngY = 0 to Ubound(set2Rows)
if set1Rows(lngX, 0) = set2Rows(lngY, 0) then
set1Rows(lngX, 1) = "1"
end if
next lngY
next lngX
' now your duplicates are flagged in the set1Rows array
for lngX = 0 to Ubound(set1Rows)
if set1Rows(lngX, 1) = "1" then
' code for duplicated
else
' code for unique
end if
next lngX
答案 1 :(得分:0)
我不知道为什么它必须是一个集合,但为了快速获得一个没有双倍(过滤列表)的所有值的数组,你可以这样做:(非常接近你的第一个例子)< / p>
Function GetColVal(Ws As Worksheet, Column As Long) As Variant
Dim runner As Variant, outputVal() As Variant, i As Long
ReDim outputVal(Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible).Count)
For Each runner In Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible)
If i = 0 Then
outputVal(0) = runner.Value: i = 1
Else
If IsError(Application.Match(runner.Value, outputVal, 0)) Then outputVal(i) = runner.Value: i = i + 1
End If
Next
ReDim Preserve outputVal(i - 1)
GetColVal= outputVal
End Function
Application.Match
是VBA中最快的函数之一,而IsInCollection
可能非常慢...更好地运行For Each ...
循环来添加集合中的所有内容而不是检查集合。 ..
Dim a As Variant
For Each a in GetColVal(Worksheets("SheetX"),7)
MyCollection.Add a
Next
应该比你的例子快得多......我仍然建议不要使用集合,特别是如果你只是使用值...如果可能的话,最好直接使用GetColVal
- 数组...
variantVariable = GetColVal(Worksheets("SheetX"),7)
然后将变量变量用于您想要做的任何事情(您也可以将其直接粘贴到工作表中的某个位置)
Dim a As Variant
a = GetColVal(Worksheets("Sheet1"),13) 'values from sheet1 column M
'pasting in one row (starting at A1 in Sheet2)
ThisWorkbook.Sheets("Sheet2").Range(Cells(1, 1), Cells(1, ubound(a) + 1)).value = a
'pasting in one column (starting at C5 in Sheet4)
ThisWorkbook.Sheets("Sheet4").Range(Cells(5, 3), Cells(ubound(a) + 5, 3)).value = Application.Transpose(a)
修改强>
展示不同的东西:
Function GetColumnValues(ws As Worksheet, Column As Long) As Range
With ws
Dim srcRng As Range, outRng As Range, runRng1 As Range, runRng2 As Range, dBool As Boolean
Set srcRng = .Range(.Cells(1, Column), .Cells(GetLastRow(ws), Column)).SpecialCells(xlCellTypeVisible)
For Each runRng1 In a
If outRng Is Nothing Then Set outRng = runRng1
For Each runRng2 In outRng
If Intersect(runRng1, runRng2) Is Nothing Then
If runRng2.Value = runRng1.Value Then dBool = True: Exit For
End If
Next
If dBool Then dBool = False Else Set outRng = Union(outRng, runRng1)
Next
End With
Set GetColumnValues = outRng
End Function
使用此功能,您将获得一系列可以选择或复制到另一个位置的单元格(使用格式化和其他所有单元格)。您仍然可以使用For Each ...
将所有元素添加到集合中。我也没有使用Match
来避免&#34; Len&gt; 255&#34; -error