我正在处理来自大型机的字母数字数据。由于访问点的性质,GetString方法在webbrowser接口中用于从大型机中提取数据。我正在重构我的代码以及旧代码以使用数据结构而不仅仅是范围对象,因为对于大型数据集,范围对象代码需要更长的时间。
作为general optimization practice的一部分,我运行所有大型数据集宏,Application.ScreenUpdating = False
和Application.Calculation = xlCalculationManual
处于活动状态。为了计时,我在将计数器与状态栏结合使用后将QueryPerformanceCounter与DoEvents结合使用,以便它为我提供完成特定宏所需的时间。 QueryPerformanceCounter位于类模块中,在执行代码的domain logic / business logic时没有直接作用。
例如,我最近重构了代码,该代码从大型机屏幕中提取了大约10,000个字符串,并通过循环将它们放入工作表中。当重构为数据结构循环时,将字符串转移到数组中时,代码大约需要70秒。代码也更具可移植性,因为这些字符串可以很容易地移位/放置到字典中进行排序或者用于解析。因此,我将所有VBA代码从基于范围的代码切换到数据结构,这是我的问题的引入/背景。
我在分析项目中遇到了一些旧代码,它有一些从大型机中提取内容的有趣逻辑。从本质上讲,代码以这种布局形式从服务器中提取内容:
然后使用Worksheet / Cell逻辑作为框架在Excel工作表中将内容解析为此表单:
代码,没有登录/访问逻辑以及sans子例程声明,如下所示:
Sub AcquireData()
CurrentServerRow = 13
WhileLoopHolder = 1
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then
NewWorksheetLine_Sub
End If
Do While WhileLoopHolder = 1
If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then
NewWorksheetLine_Sub
End If
ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then
Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
ValueSets = ValueSets + 1
End If
Else
If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
Cells(WorksheetRow, WorksheetColumn) = "X"
Else
Cells(WorksheetRow, WorksheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If
Cells(WorksheetRow, WorksheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
Cells(WorksheetRow, WorksheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
WorksheetColumn = WorksheetColumn + 3
ValueSets = ValueSets + 1
End If
CurrentServerRow = CurrentServerRow + 1
If CurrentServerRow > 41 Then
WhileLoopHolder = 0
End If
Loop
End Sub
Sub NewWorksheetLine_Sub()
WorksheetRow = WorksheetRow + 1
WorksheetColumn = 1
ValueSets = 10
End Sub
此代码嵌套在另一个程序的循环中,从而拉出数千行并整齐地组织它们。它还需要数小时,浪费宝贵的时间,可用于分析从服务器获取的数据。我设法将基本代码重构为数据结构,并使用我的学习来重构其他代码。不幸的是,我错误地重构了这个特殊的代码,因为我无法正确地模仿业务逻辑。我的片段如下:
Sub AcquireData()
'This code refactors the data into a datastructure from a range object, but does not really capture the logic.
'Also, There is an error in attempting to insert a variant array into a collection/dictionary data structure.
CurrentServerRow = 13
ReDim SourceDataArray(10)
WhileLoopHolder = 1
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then
NewWorksheetLine_Sub
End If
Do While WhileLoopHolder = 1
If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then
NewWorksheetLine_Sub
End If
ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then
ReDim Preserve SourceDataArray(ValueSets)
SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
ValueSets = ValueSets + 1
ReDim Preserve SourceDataArray(ValueSets)
End If
Else
If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
ReDim Preserve SourceDataArray(WorkSheetColumn)
SourceDataArray(WorkSheetColumn) = "X"
Else
SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If
SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
WorkSheetColumn = WorkSheetColumn + 3
ValueSets = ValueSets + 1
ReDim Preserve SourceDataArray(ValueSets)
End If
CurrentServerRow = CurrentServerRow + 1
If CurrentServerRow > 41 Then
WhileLoopHolder = 0
End If
Loop
End Sub
Sub NewWorksheetLine_Sub()
SourceIndexAsString = SourceCollectionIndex
SourceDataCollection.Add SourceDataArray(), SourceIndexAsString
SourceCollectionIndex = SourceCollectionIndex + 1
WorkSheetColumn = 1
ValueSets = 10
End Sub
我认为为了使用相同类型的“单元格”逻辑,我可能想要使用嵌套在数组中的数组,然后将其转置到工作表中。但是,到目前为止,我在过去几周内未能成功实施任何此类解决方案。此外,可能有一种将逻辑重构为数据结构形式的优良方法。但是,我一直无法确定如何成功。
总而言之,我的问题如下:我可以用什么方式将“基于单元”的逻辑转换为数据结构逻辑?这样做的最佳数据结构是什么?在这种特殊情况下,如何使用此业务逻辑实现数据结构逻辑的使用?
答案 0 :(得分:1)
ReDim Preserve
的某些用法似乎有问题。
If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
ReDim Preserve SourceDataArray(WorkSheetColumn)
SourceDataArray(WorkSheetColumn) = "X"
因此,如果WorksheetColumn
的值为1
,我们会将SourceDataArray
缩减为一个条目,并丢弃数组中较高位置的所有数据。
Else
SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If
SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
现在我们可能会查看SourceDataArray
中不存在的条目(即当遵循上面的If
分支而不是Else
分支时),我们应该得到一个“下标超出范围“错误
ReDim Preserve
仅保留数组元素的数据,这些元素对新数组大小有意义。因此,如果我们有ReDim a(10)
然后有ReDim Preserve a(5)
(并假设数组从元素0开始 - 即没有Option Base 1
)那么a(5)
到a(9)
现在是无法访问,他们包含的数据丢失
答案 1 :(得分:1)
要将使用单元格引用的代码重构为数组,您需要使用 2维数组。
单元格引用是基于1的,因此您也应该坚持使用数组中的内容。
您可以使用Range.Value
属性
' Range to array
Dim a as Variant
a = Range("A1:J100").Value
将导致a
成为大小为1 To 100, 1 To 10
' Array to Range
Dim a(1 To 100, 1 To 10) as Variant
' populate a
' ...
' Put a into a range
Range("A1:J100").Value = a
这两个代码段产生相同的输出,但第二个很多
Dim r as Long, c as Long
For r = 1 To 1000
For c = 1 To 100
Cells(r, c) = r * c
Next c, r
Dim r as Long, c as Long
Dim a() as Variant
Redim a(1 To 1000, 1 To 100)
For r = 1 To 1000
For c = 1 To 100
a(r, c) = r * c
Next c, r
Range("A1:CV1000") = a
ReDim Preserve
是一项相对昂贵的操作,因此在块中ReDim
更快
而不是这个
Redim a(1 To 10, 1 To 1)
For 1 = 1 to 100000
Redim Preserve a(1 To 10, 1 To i)
a(i) = SomeValue
Next
改为
Redim a(1 To 10, 1 To 1000)
For 1 = 1 to 100000
If i > UBound(a) Then
Redim Preserve a(1 To 10, 1 To UBound(a) + 1000)
End If
a(i) = SomeValue
Next
Redim Preserve a (1 To 10, 1 To i - 1)
Redim Preserve
只能更改多维数组的最后一个维度。
例如,这是有效的
Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 100, 1 To 20)
这不起作用
Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 200, 1 To 20)
通常在处理表示范围的数组时,它的行数变化最大。这是一个问题,因为Range.Value
数组是(1 To Rows, 1 To Columns)
解决方法是实际确定数组(1 To Columns, 1 To Rows)
的尺寸。根据需要Redim
行数,然后Transpose
进入目标范围
Dim r As Long, c As Long
Dim a() As Variant
ReDim a(1 To 100, 1 To 200)
For r = 1 To 1000
For c = 1 To 100
If r > UBound(a, 2) Then
ReDim Preserve a(1 To UBound(a, 1), 1 To UBound(a, 2) + 200)
End If
a(c, r) = r * c
Next c, r
Range("A1:CV1000") = Application.Transpose(a)
如果您需要更改两个维度,要更改第一个维度,则需要创建所需大小的新数组,并将数据从旧数组复制到新数组。再次,像块一样重新整理以避免过多的redim's
最后一件事:你似乎没有Dim
你的变量(除非你刚刚将这部分留在了你的帖子之外)。我建议您使用Option Explicit
和Dim
所有变量。这有助于避免数据类型错误,并避免使用Variant
来处理所有事情。当你需要Variants
时很好,但是当你不需要时,其他数据类型通常会更快。
答案 2 :(得分:0)
一旦我花了几周时间将其他宏从基于范围的逻辑重构为抽象的数据结构逻辑,一旦我回到这个宏,答案就会打动我。如果我只是模仿范围逻辑以便更快地完成宏,那么我只需要填充数组,使其在转置后匹配范围。这意味着我不需要修剪数组或以任何方式操纵其形式 - 我只需要以数组形式填充数据结构,然后将其转置到电子表格中。一旦数组填满,我也可以替代使用数据。
以下是解决方案代码:
Sub AcquireData()
'The array 'MyArray' was dimensioned as a dynamic array in the declarations section at the top of the module.
'Redim the array to a big 2 dimensional array that fits the needs of the data/macro.
ReDim MyArray(1 To 20, 1 To 20000)
'From here on, simply mimic the logic of the range macro... [i]
CurrentServerRow = 13
WhileLoopHolder = 1
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then
NewWorksheetLine_Sub
End If
Do While WhileLoopHolder = 1
If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then
NewWorksheetLine_Sub
End If
ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then
If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then
'[i] ... except, move the values into the array in Column, Row logic form.
MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
ValueSets = ValueSets + 1
End If
Else
If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
MyArray(WorksheetColumn, WorksheetRow) = "X"
Else
MyArray(WorksheetColumn, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If
MyArray(WorksheetColumn + 1, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
MyArray(WorksheetColumn + 2, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
WorksheetColumn = WorksheetColumn + 3
ValueSets = ValueSets + 1
End If
CurrentServerRow = CurrentServerRow + 1
If CurrentServerRow > 41 Then
WhileLoopHolder = 0
End If
Loop
ArrayToWorkSheet_Sub
End Sub
Sub NewWorksheetLine_Sub()
WorksheetRow = WorksheetRow + 1
WorksheetColumn = 1
ValueSets = 10
End Sub
'When finished with the loop, push the array to the worksheet, and transpose it to provide the correct column to row relationship in the spreadsheet.
Sub ArrayToWorkSheet_Sub()
Dim ArrayLimit As Long
Dim LastCell As Long
Dim MyRange As Range
'This level of precision in setting the range appears unnecessary, but in theory I think it could speed up tranposing the array - [ii]
'[ii]but that is just speculation. Performance improvements for the tranposition appear to be minor, perhaps due to the fact that [iii]
'[iii]most - if not nearly all - of the intense computations occur earlier.
With Sheets("Sheet2")
ArrayLimit = UBound(MyArray, 2)
LastCell = ArrayLimit + 1
Set MyRange = .Range("A2:S" & LastCell)
MyRange = WorksheetFunction.Transpose(MyArray)
End With
End Sub
虽然Application.ScreenUpdating = False
和Application.Calculation = xlCalculationManual
在减少宏运行时都是非常宝贵的,但我在将这两行与抽象数据结构结合使用方面有非常积极的经验。在某些情况下,似乎数据结构似乎有助于优化性能,尤其是在宏过程中涉及大量逐行数据提取的情况下。