当我偶然发现another question of mine时,我正在研究this helpful question and answer.的解决方案。但是,当我退出该功能时,实现Control Freak在那里给出的答案会抛出$var_1 = $_REQUEST['var_1']; //REQUEST, GET or POST
错误并返回我的代码:Type Mismatch
。我不是那么熟练的程序员来弄清楚这里出了什么问题,所以任何人都可以对此有所了解。
这是我的代码:
Years = ReDimPreserve(Years, i, 3)
以下是Control Freak编写的函数:
Sub DevideData()
Dim i As Integer
Dim Years() As String
ReDim Years(1, 3)
Years(1, 1) = Cells(2, 1).Value
Years(1, 2) = 2
i = 2
ThisWorkbook.Worksheets("Simple Boundary").Activate
TotalRows = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp).row
For row = 3 To TotalRows
Years = ReDimPreserve(Years, i, 3)
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
Years(i - 1, 3) = row - 1
Years(i, 1) = Cells(row, 1).Value
Years(i, 2) = row
i = i + 1
End If
Next row
End Sub
答案 0 :(得分:2)
我答应了更全面的答案。对不起,这比我预期的要晚:
正如我在第一次评论中所说:
Public Function ReDimPreserve(aArrayToPreserve, nNewFirstUBound, nNewLastUBound)
导致aArrayToPreserve
具有默认类型Variant。这不匹配:
Dim Years() As String
正如您所发现的那样,将“岁月”重新定义为“变体”,可以解决问题。另一种方法是修改ReDimPreserve
的声明,因此aArrayToPreserve
是一个String类型的数组。我不推荐这种方法,因为你在数组中存储字符串和数字。 Variant数组将处理字符串或数字,而String数组只能通过将数字转换为字符串进行存储并返回数字进行处理来处理数字。
我尝试使用不同数量的数据和不同的修改宏来定时运行:
Rows of data Amendment Duration of run
3,500 Years() changed to Variant 4.99 seconds
35,000 Years() changed to Variant 502 seconds
35,000 aArrayToPreserve changed to String 656 seconds
正如我在第二条评论中所说,ReDim Preserve
对于内置方法和您找到的VBA例程都很慢。每次通话都必须:
ReDim Preserve
是一种非常有用的方法,但必须非常谨慎地使用它。有时我发现在开始时将数组的大小调整到最大值并使用ReDim Preserve将数组减少到最后使用的大小是一种更好的技术。下面显示的最佳技术确定了在调整数组大小之前所需的条目数。
在日常工作的最底层,我补充道:
For i = LBound(Years, 1) To LBound(Years, 1) + 9
Debug.Print Years(i, 0) & "|" & Years(i, 1) & "|" & Years(i, 2) & "|" & Years(i, 3)
Next
For i = UBound(Years, 1) - 9 To UBound(Years, 1)
Debug.Print Years(i, 0) & "|" & Years(i, 1) & "|" & Years(i, 2) & "|" & Years(i, 3)
Next
这导致以下内容输出到立即窗口:
|||
|AAAA|2|2
|AAAB|3|4
|AAAC|5|7
|AAAD|8|11
|AAAE|12|16
|AAAF|17|22
|AAAG|23|23
|AAAH|24|25
|AAAI|26|28
|AOUJ|34973|34976
|AOUK|34977|34981
|AOUL|34982|34987
|AOUM|34988|34988
|AOUN|34989|34990
|AOUO|34991|34993
|AOUP|34994|34997
|AOUQ|34998|35002
|AOUR|35003|
|||
由于您调用了数组Years
,我怀疑我的字符串值与您的字符串类似。这没关系。重要的是,我怀疑这个输出正是你想要的。
如果你写:
ReDim Years(1, 3)
下限设置为Option Base
语句指定的值,如果没有Option Base
语句则设置为零。您有两个不使用的维度的下限。这就是顶部“|||”的原因。最后还有另一个“|||”表示您正在创建一个您不使用的最后一行。最后使用的行没有一个我认为是错误的结束行。
当我可以将例程分成几个步骤时,我总是在前进到下一步之前验证一步的结果。这样,我知道任何问题都在当前步骤内,而不是早期步骤中的错误结果。我使用Debug.Print
大部分时间输出到立即窗口。只有当我想输出大量诊断信息时,才会写入文本文件。无论哪种方式,像我这样的代码块都是快速调试宏的重要帮助。
我永远不会写ReDim Years(1, 3)
。我总是指定下限,以便绝对清楚。 VBA是我所知道的唯一语言,您可以在其中为下限指定任何值(假设它小于上限),因此如果对特定问题有帮助,我将指定非标准值。在这种情况下,我认为除了一个以外的下限没有优势,所以这就是我所使用的。
对于两个维度数组,通常将列作为第一维,将行作为第二维。一个例外是读取或写入工作表的数组,其中维度是相反的。您将行作为第一个维度。如果您使用了传统序列,则可以使用ReDim Preserve
方法,从而避免使用RedimPreserve
函数和不匹配类型的问题。
技术1
我希望这是最快的技术。专家建议我们避免“重新发明轮子”。也就是说,如果Excel具有可以执行所需操作的例程,请不要在VBA中编写替代代码。但是,我发现了许多不成立的例子,我发现这种技术就是其中之一。
这里显而易见的技巧是使用Filter
,然后使用SpecialCells
创建一系列可见行,最后处理此范围内的每一行。我已成功地使用这种技术来满足其他要求,但不是在这里。
我不知道VBA选择了唯一的行,因此启动了宏录制器并从键盘中过滤了我的测试数据以获得:
Range("A1:A35000").AdvancedFilter Action:=xlFilterInPlace, Unique:=True
我过去使用Filter
已全部转换为AutoFilter,我发现它可以提供可接受的性能。这转换为AdvancedFilter
,从键盘和VBA都需要20秒。我不知道为什么这么慢。
第二个问题是:
Set RngUnique = .Range(.Cells(1, 1), .Cells(RowLast, 1)) _
.SpecialCells(xlCellTypeVisible)
被拒绝为“过于复杂”。
无法将可见行作为范围获取意味着Filter
的好处并非真正可用。我已计算可见行以模拟RngUnique.Rows.Count
。这显示了始终与AutoFilter
一起使用的技术。如果AdvancedFilter
在接受的时间内报告了唯一的行,我可能已经调查了这个问题,但在这种情况下似乎不值得努力。
展示这种技术的宏是:
Option Explicit
Sub Technique1()
' * Avoid using meaningless names like i. Giving every variable a meaningful
' name is helpful during development and even more helpful when you return
' to the macro in six months for maintenence.
' * My naming convention is use a sequence of keywords. The first keyword
' identifies what type of data the variable holds. So "Row" means it holds
' a row number. Each subsequent keyword narrows the scope. "RowSb" is a
' row of the worksheet "Simple Boundary" and "RowYears" is a row of the Years
' array. "RowSbCrnt"is the current row of the worksheet "Simple Boundary".
' * I can look at macros I wrote years ago and know what all the variables are.
' You may not like my convention. Fine, development your own but do not
' try programming with random names.
' * Avoid data type Integer which specifies a 16-bit whole number and requires
' special processing on 32 and 64-bit computers. Long is now the recommended
' data type for whole numbers.
Dim NumRowsVisible As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
' This can save significant amounts of time if the macro amends the
' screen or switches between workbooks.
Application.ScreenUpdating = False
With Worksheets("Simple Boundary")
' Rows.Count avoiding having to guess how many rows will be used
RowSbLast = .Cells(Rows.Count, "A").End(xlUp).Row
' Hide non-unique rows
With .Range(.Cells(1, 1), .Cells(RowSbLast, 1))
.AdvancedFilter Action:=xlFilterInPlace, Unique:=True
End With
' Count number of unique rows.
' It is difficult to time small pieces of code because OS routines
' can execute at any time. However, this count takes less than .5
' of a second with 35,000 rows.
NumRowsVisible = 0
For RowSbCrnt = 2 To RowSbLast
If Not .Rows(RowSbCrnt).Hidden Then
NumRowsVisible = NumRowsVisible + 1
End If
Next
' Use count to ReDim array to final size.
ReDim Years(1 To 3, 1 To NumRowsVisible)
RowYearsCrnt = 1
Years(1, RowYearsCrnt) = .Cells(2, 1).Value
Years(2, RowYearsCrnt) = 2
For RowSbCrnt = 3 To RowSbLast
If Not .Rows(RowSbCrnt).Hidden Then
Years(3, RowYearsCrnt) = RowSbCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(1, RowYearsCrnt) = .Cells(RowSbCrnt, 1).Value
Years(2, RowYearsCrnt) = RowSbCrnt
End If
Next
' Record final row for final string
Years(3, RowYearsCrnt) = RowSbLast
.ShowAllData ' Clear AdvancedFilter
End With
Application.ScreenUpdating = True
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
End Sub
立即窗口的输出是:
Duration: 20.570
AAAA|2|2|
AAAB|3|4|
AAAC|5|7|
AAAD|8|11|
AAAE|12|16|
AAAF|17|22|
AAAG|23|23|
AAAH|24|25|
AAAI|26|28|
AOUI|34970|34972|
AOUJ|34973|34976|
AOUK|34977|34981|
AOUL|34982|34987|
AOUM|34988|34988|
AOUN|34989|34990|
AOUO|34991|34993|
AOUP|34994|34997|
AOUQ|34998|35002|
AOUR|35003|35008|
如您所见,最后一行是正确的。持续时间为20秒优于技术的8分钟,但我相信我们可以做得更好。
技术2
下一个宏与上一个宏类似,但它计算唯一行,而不是使用AdvancedFilter来隐藏非唯一行。此宏的持续时间为1.5秒,包含35,000行。这表明在第一次传递数据时计算数组所需的行数是可行的方法。此宏的诊断输出与上述相同。
Sub Technique2()
Dim NumRowsUnique As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
With Worksheets("Simple Boundary")
RowSbLast = .Cells(Rows.Count, "A").End(xlUp).Row
' Count number of unique rows.
' Assume all data rows are unique until find otherwise
NumRowsUnique = RowSbLast - 1
For RowSbCrnt = 3 To RowSbLast
If .Cells(RowSbCrnt, 1).Value = .Cells(RowSbCrnt - 1, 1).Value Then
NumRowsUnique = NumRowsUnique - 1
End If
Next
' * Use count to ReDim array to final size.
' * Note that I have defined the columns as the first dimension and rows
' as the second dimension to match convention. Had I wished, this would
' have allowed me to use the standard ReDim Preserve which can only
' adjust the last dimension. However, this does not match the
' syntax of Cells which has the row first. It may have been better to
' maintain your sequence so the two sequences were the same.
ReDim Years(1 To 3, 1 To NumRowsUnique)
RowYearsCrnt = 1
Years(1, RowYearsCrnt) = .Cells(2, 1).Value
Years(2, RowYearsCrnt) = 2
For RowSbCrnt = 3 To RowSbLast
If .Cells(RowSbCrnt, 1).Value <> .Cells(RowSbCrnt - 1, 1).Value Then
Years(3, RowYearsCrnt) = RowSbCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(1, RowYearsCrnt) = .Cells(RowSbCrnt, 1).Value
Years(2, RowYearsCrnt) = RowSbCrnt
End If
Next
' Record final row for final string
Years(3, RowYearsCrnt) = RowSbLast
End With
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(1, RowYearsCrnt) & "|" & _
Years(2, RowYearsCrnt) & "|" & _
Years(3, RowYearsCrnt) & "|"
Next
End Sub
技术3
下一个宏与上一个宏稍有不同。
首先,我已经用以下常量替换了用于识别工作表和数组中列数的文字:
Const ColYrEnd As Long = 3
根据我的命名惯例ColYrEnd
= Col 以及 Y ea r 数组保持范围结束因此:
Years(ColYrEnd, RowYearsCrnt) = RowCvCrnt - 1
instead of Years(3, RowYearsCrnt) = RowCvCrnt - 1
这对编译的代码没有任何影响,但使源代码更容易理解,因为您不必记住第1,2和3列的含义。更重要的是,如果您需要重新排列列,则更新常量是唯一需要进行的更改。如果你需要搜索一个长宏来代替每次使用2作为列号(而忽略任何其他2的使用),那么你就会知道为什么这很重要。
其次,我使用过:
ColValues = .Range(.Cells(1, ColSbYear), _
.Cells(RowSbLast, ColSbYear)).Value
将第1列导入数组。从工作表中读取值的代码现在从此数组中读取它们。数组访问比工作表访问快得多,因此这将运行时间从1.5秒减少到.07秒。
修订后的代码是:
Sub Technique3()
Const ColCvYear As Long = 1
Const ColSbYear As Long = 1
Const ColYrYear As Long = 1
Const ColYrStart As Long = 2
Const ColYrEnd As Long = 3
Const RowSbDataFirst As Long = 2
Const RowCvDataFirst As Long = 2
Dim ColValues As Variant
Dim NumRowsUnique As Long
Dim RowCvCrnt As Long
Dim RowSbCrnt As Long
Dim RowSbLast As Long
Dim RowYearsCrnt As Long
Dim TimeStart As Double
Dim Years() As Variant
TimeStart = Timer ' Get the time as seconds since midnight to nearest .001
' of a second
With Worksheets("Simple Boundary")
RowSbLast = .Cells(Rows.Count, ColSbYear).End(xlUp).Row
ColValues = .Range(.Cells(1, ColSbYear), _
.Cells(RowSbLast, ColSbYear)).Value
' * The above statement imports all the data from column 1 as a two dimensional
' array into a Variant. The Variant is then accessed as though it is an array.
' * The first dimension has one entry per row, the second dimension has on entry
' per column which is one in this case. Both dimensions will have a lower bound
' of one even if the first row or column loaded is not one.
End With
' Count number of unique rows.
' Assume all data rows are unique until find otherwise
NumRowsUnique = UBound(ColValues, 1) - 1
For RowCvCrnt = RowCvDataFirst + 1 To UBound(ColValues, 1)
If ColValues(RowCvCrnt, ColCvYear) = ColValues(RowCvCrnt - 1, ColCvYear) Then
NumRowsUnique = NumRowsUnique - 1
End If
Next
' I mentioned earlier that I was unsure if having rows and columns in the
' convention sequence was correct. I am even less sure here where array
' ColValues has been loaded from a worksheet and the rows and columns are
' not in the conventional sequence. ReDim Years(1 To 3, 1 To NumRowsUnique)
RowYearsCrnt = 1
Years(ColYrYear, RowYearsCrnt) = ColValues(RowCvDataFirst, ColCvYear)
Years(ColYrStart, RowYearsCrnt) = RowCvDataFirst
For RowCvCrnt = RowCvDataFirst + 1 To UBound(ColValues, 1)
If ColValues(RowCvCrnt, ColCvYear) <> ColValues(RowCvCrnt - 1, ColCvYear) Then
Years(ColYrEnd, RowYearsCrnt) = RowCvCrnt - 1
RowYearsCrnt = RowYearsCrnt + 1
Years(ColYrYear, RowYearsCrnt) = ColValues(RowCvCrnt, ColCvYear)
Years(ColYrStart, RowYearsCrnt) = RowCvCrnt
End If
Next
' Record final row for final string
Years(ColYrEnd, RowYearsCrnt) = UBound(ColValues, 1)
Debug.Print "Duration: " & Format(Timer - TimeStart, "#,##0.000")
' Output diagnostics
For RowYearsCrnt = 1 To 9
Debug.Print Years(ColYrYear, RowYearsCrnt) & "|" & _
Years(ColYrStart, RowYearsCrnt) & "|" & _
Years(ColYrEnd, RowYearsCrnt) & "|"
Next
' Note that rows are now in the second dimension hence the 2 in UBound(Years, 2)
For RowYearsCrnt = UBound(Years, 2) - 9 To UBound(Years, 2)
Debug.Print Years(ColYrYear, RowYearsCrnt) & "|" & _
Years(ColYrStart, RowYearsCrnt) & "|" & _
Years(ColYrEnd, RowYearsCrnt) & "|"
Next
End Sub
其他技术
我考虑引入其他技术,但我认为它们对此要求没用。此外,这个答案已经足够长了。我提供了很多让你思考的东西,而且更多的东西会超载。如上所述,我已将35,000行的运行时间从8分钟减少到20秒,再缩短到1.5秒到.07秒。
通过我的宏慢慢地工作。我希望我已经对每个人所做的事情提供了充分的解释。一旦你知道一个语句存在,通常很容易查找它,所以没有太多的语句解释。如有必要,请回答问题。
答案 1 :(得分:1)
如前面评论中所述,ReDim Preserve在处理大型数据集时是一项昂贵的调用,通常可以避免。以下是一些应根据需要执行的注释代码。在具有200,000行的数据集上进行测试,完成时间不到5秒。测试了1000行的数据集,完成时间不到0.1秒。
代码使用Collection从列A中获取唯一值,然后根据这些唯一值构建数组,并将结果输出到另一个工作表。在您的原始代码中,没有输出结果数组,所以我只是做了一些事情,您需要根据需要调整输出部分。
Sub tgr()
Dim ws As Worksheet
Dim rngYears As Range
Dim collUnqYears As Collection
Dim varYear As Variant
Dim arrAllYears() As Variant
Dim arrYearsData() As Variant
Dim YearsDataIndex As Long
Set ws = ActiveWorkbook.Sheets("Simple Boundary")
Set rngYears = ws.Range("A1", ws.Cells(Rows.Count, "A").End(xlUp))
If rngYears.Cells.Count < 2 Then Exit Sub 'No data
Set collUnqYears = New Collection
With rngYears
.CurrentRegion.Sort rngYears, xlAscending, Header:=xlYes 'Sort data by year in column A
arrAllYears = .Offset(1).Resize(.Rows.Count - 1).Value 'Put list of years in array for faster calculation
'Get count of unique years by entering them into a collection (forces uniqueness)
For Each varYear In arrAllYears
On Error Resume Next
collUnqYears.Add CStr(varYear), CStr(varYear)
On Error GoTo 0
Next varYear
'Ssize the arrYearsData array appropriately
ReDim arrYearsData(1 To collUnqYears.Count, 1 To 3)
'arrYearsData column 1 = Unique Year value
'arrYearsData column 2 = Start row for the year
'arrYearsData column 3 = End row for the year
'Loop through unique values and populate the arrYearsData array with desired information
For Each varYear In collUnqYears
YearsDataIndex = YearsDataIndex + 1
arrYearsData(YearsDataIndex, 1) = varYear 'Unique year
arrYearsData(YearsDataIndex, 2) = .Find(varYear, .Cells(1), , , , xlNext).Row 'Start Row
arrYearsData(YearsDataIndex, 3) = .Find(varYear, .Cells(1), , , , xlPrevious).Row 'End Row
Next varYear
End With
'Here is where you would output your results
'Your original code did not output results anywhere, so adjust sheet and start cell as necessary
With Sheets("Sheet2")
.UsedRange.Offset(1).ClearContents 'Clear previous result data
.Range("A2").Resize(UBound(arrYearsData, 1), UBound(arrYearsData, 2)).Value = arrYearsData
.Select 'This will show the output sheet so you can see the results
End With
End Sub
答案 2 :(得分:0)
正如您在评论中提到的,如果您要继续这种方式,您肯定需要在if语句中移动该redim:
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
Years = ReDimPreserve(Years, i, 3)
Years(i - 1, 3) = row - 1
Years(i, 1) = Cells(row, 1).Value
Years(i, 2) = row
i = i + 1
End If
我认为这种重新调整的多维数组对你来说太过分了。我有一些建议:
我注意到你使用2个值来表示范围的开始和范围的结束(年(i,2)是开始,年(i,3)是结束)。相反,为什么不使用实际范围?
创建一个名为startNode
的范围变量,当您找到范围的结尾时,创建一个Range
对象,与Range(startNode,endNode)
一样。
您的代码将如下所示:
Sub DevideData()
Dim firstCell As Range
Dim nextRange As Range
Set firstCell = Cells(2,1)
ThisWorkbook.Worksheets("Simple Boundary").Activate
TotalRows = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp).row
For row = 3 To TotalRows
If Not Cells(row, 1).Value = Cells(row - 1, 1).Value Then
Set nextRange = Range(firstCell, Cells(row-1,1))
Set firstCell = Cells(row,1)
End If
Next row
End Sub
现在您不需要存储3个值!只是一系列范围,您可以像这样重新编写:
Dim years() As Range
'Do Stuff'
ReDim Preserve years(1 to i)
set years(i) = nextRange
i = i + 1
请注意,创建ReDimPreserve
的唯一原因是您可以重新创建二维维度的二维数组(通常只能更改第二维)。使用一维阵列,您可以自由地重新安装,没有任何麻烦! :)
最后,我建议您使用for each
循环而不是常规for循环。它使您对循环的意图更加明确,这使您的代码更具可读性。
Dim firstCell as Range
Dim lastUniqueValue as Variant
Dim lastCell as Range
Dim iCell as Range
Set firstCell = Cells(3,1)
lastUniqueValue = firstCell.Value
Set lastCell = ThisWorkbook.Worksheets("Simple Boundary").Range("A100000").End(xlUp)
For Each iCell in Range(firstCell, lastCell)
If iCell.Value <> lastUniqueValue Then
lastUniqueValue = iCell.Value
'Do Stuff
End If
Next
希望这有帮助! :)