Excel单元格/范围逻辑作为数组逻辑

时间:2012-09-06 18:21:06

标签: data-structures excel-vba excel-2007 vba excel

我正在处理来自大型机的字母数字数据。由于访问点的性质,GetString方法在webbrowser接口中用于从大型机中提取数据。我正在重构我的代码以及旧代码以使用数据结构而不仅仅是范围对象,因为对于大型数据集,范围对象代码需要更长的时间。

作为general optimization practice的一部分,我运行所有大型数据集宏,Application.ScreenUpdating = FalseApplication.Calculation = xlCalculationManual处于活动状态。为了计时,我在将计数器与状态栏结合使用后将QueryPerformanceCounter与DoEvents结合使用,以便它为我提供完成特定宏所需的时间。 QueryPerformanceCounter位于类模块中,在执行代码的domain logic / business logic时没有直接作用。

例如,我最近重构了代码,该代码从大型机屏幕中提取了大约10,000个字符串,并通过循环将它们放入工作表中。当重构为数据结构循环时,将字符串转移到数组中时,代码大约需要70秒。代码也更具可移植性,因为这些字符串可以很容易地移位/放置到字典中进行排序或者用于解析。因此,我将所有VBA代码从基于范围的代码切换到数据结构,这是我的问题的引入/背景。

我在分析项目中遇到了一些旧代码,它有一些从大型机中提取内容的有趣逻辑。从本质上讲,代码以这种布局形式从服务器中提取内容:

Raw Data Pulled From Server Into Excel Sheet

然后使用Worksheet / Cell逻辑作为框架在Excel工作表中将内容解析为此表单:

Data Parsed from Server into Excel Sheet

代码,没有登录/访问逻辑以及sans子例程声明,如下所示:

Sub AcquireData()

    CurrentServerRow = 13

    WhileLoopHolder = 1

    If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

        NewWorksheetLine_Sub

    End If

    Do While WhileLoopHolder = 1

        If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

            If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

                NewWorksheetLine_Sub

            End If

        ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

            If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then
                Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
                ValueSets = ValueSets + 1
            End If

        Else

            If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

                Cells(WorksheetRow, WorksheetColumn) = "X"

            Else

                Cells(WorksheetRow, WorksheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

            End If

            Cells(WorksheetRow, WorksheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
            Cells(WorksheetRow, WorksheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
            Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
            WorksheetColumn = WorksheetColumn + 3
            ValueSets = ValueSets + 1

        End If

        CurrentServerRow = CurrentServerRow + 1

        If CurrentServerRow > 41 Then

            WhileLoopHolder = 0

        End If

    Loop

End Sub

Sub NewWorksheetLine_Sub()

        WorksheetRow = WorksheetRow + 1
        WorksheetColumn = 1
        ValueSets = 10

End Sub

此代码嵌套在另一个程序的循环中,从而拉出数千行并整齐地组织它们。它还需要数小时,浪费宝贵的时间,可用于分析从服务器获取的数据。我设法将基本代码重构为数据结构,并使用我的学习来重构其他代码。不幸的是,我错误地重构了这个特殊的代码,因为我无法正确地模仿业务逻辑。我的片段如下:

Sub AcquireData()
'This code refactors the data into a datastructure from a range object, but does not really capture the logic.
'Also, There is an error in attempting to insert a variant array into a collection/dictionary data structure.


CurrentServerRow = 13

ReDim SourceDataArray(10)

WhileLoopHolder = 1

If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

    NewWorksheetLine_Sub

End If

Do While WhileLoopHolder = 1

    If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

            NewWorksheetLine_Sub

        End If

    ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then

            ReDim Preserve SourceDataArray(ValueSets)
            SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))

            ValueSets = ValueSets + 1
            ReDim Preserve SourceDataArray(ValueSets)
        End If

    Else

        If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

            ReDim Preserve SourceDataArray(WorkSheetColumn)
            SourceDataArray(WorkSheetColumn) = "X"

        Else

            SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

        End If

        SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)

        SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))

        SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))

        WorkSheetColumn = WorkSheetColumn + 3
        ValueSets = ValueSets + 1
        ReDim Preserve SourceDataArray(ValueSets)

    End If

    CurrentServerRow = CurrentServerRow + 1

    If CurrentServerRow > 41 Then

        WhileLoopHolder = 0

    End If

Loop

End Sub

Sub NewWorksheetLine_Sub()

SourceIndexAsString = SourceCollectionIndex

   SourceDataCollection.Add SourceDataArray(), SourceIndexAsString

    SourceCollectionIndex = SourceCollectionIndex + 1
    WorkSheetColumn = 1
    ValueSets = 10

End Sub

我认为为了使用相同类型的“单元格”逻辑,我可能想要使用嵌套在数组中的数组,然后将其转置到工作表中。但是,到目前为止,我在过去几周内未能成功实施任何此类解决方案。此外,可能有一种将逻辑重构为数据结构形式的优良方法。但是,我一直无法确定如何成功。

总而言之,我的问题如下:我可以用什么方式将“基于单元”的逻辑转换为数据结构逻辑?这样做的最佳数据结构是什么?在这种特殊情况下,如何使用此业务逻辑实现数据结构逻辑的使用?

3 个答案:

答案 0 :(得分:1)

ReDim Preserve的某些用法似乎有问题。

If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
  ReDim Preserve SourceDataArray(WorkSheetColumn)
  SourceDataArray(WorkSheetColumn) = "X"

因此,如果WorksheetColumn的值为1,我们会将SourceDataArray缩减为一个条目,并丢弃数组中较高位置的所有数据。

Else
  SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If

SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)

SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))

现在我们可能会查看SourceDataArray中不存在的条目(即当遵循上面的If分支而不是Else分支时),我们应该得到一个“下标超出范围“错误

ReDim Preserve仅保留数组元素的数据,这些元素对新数组大小有意义。因此,如果我们有ReDim a(10)然后有ReDim Preserve a(5)(并假设数组从元素0开始 - 即没有Option Base 1)那么a(5)a(9)现在是无法访问,他们包含的数据丢失

答案 1 :(得分:1)

要将使用单元格引用的代码重构为数组,您需要使用 2维数组
单元格引用是基于1的,因此您也应该坚持使用数组中的内容。

您可以使用Range.Value属性

将范围复制到数组中或从数组中复制范围
' Range to array
Dim a as Variant
a = Range("A1:J100").Value

将导致a成为大小为1 To 100, 1 To 10

的变体数组
' Array to Range
Dim a(1 To 100, 1 To 10) as Variant
' populate a
' ...
' Put a into a range
Range("A1:J100").Value = a

这两个代码段产生相同的输出,但第二个很多

Dim r as Long, c as Long
For r = 1 To 1000
For c = 1 To 100
    Cells(r, c) = r * c
Next c, r


Dim r as Long, c as Long
Dim a() as Variant 
Redim a(1 To 1000, 1 To 100)   
For r = 1 To 1000
For c = 1 To 100
    a(r, c) = r * c
Next c, r
Range("A1:CV1000") = a

ReDim Preserve是一项相对昂贵的操作,因此在块中ReDim更快

而不是这个

Redim a(1 To 10, 1 To 1)
For 1 = 1 to 100000
    Redim Preserve a(1 To 10, 1 To i)
    a(i) = SomeValue
Next

改为

Redim a(1 To 10, 1 To 1000)
For 1 = 1 to 100000
    If i > UBound(a) Then
        Redim Preserve a(1 To 10, 1 To UBound(a) + 1000)
    End If
    a(i) = SomeValue
Next
Redim Preserve a (1 To 10, 1 To i - 1)

Redim Preserve只能更改多维数组的最后一个维度。

例如,这是有效的

Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 100, 1 To 20)

这不起作用

Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 200, 1 To 20)

通常在处理表示范围的数组时,它的行数变化最大。这是一个问题,因为Range.Value数组是(1 To Rows, 1 To Columns)

解决方法是实际确定数组(1 To Columns, 1 To Rows)的尺寸。根据需要Redim行数,然后Transpose进入目标范围

Dim r As Long, c As Long
Dim a() As Variant
ReDim a(1 To 100, 1 To 200)
For r = 1 To 1000
For c = 1 To 100
    If r > UBound(a, 2) Then
        ReDim Preserve a(1 To UBound(a, 1), 1 To UBound(a, 2) + 200)
    End If
    a(c, r) = r * c
Next c, r
Range("A1:CV1000") = Application.Transpose(a)

如果您需要更改两个维度,要更改第一个维度,则需要创建所需大小的新数组,并将数据从旧数组复制到新数组。再次,像块一样重新整理以避免过多的redim's

最后一件事:你似乎没有Dim你的变量(除非你刚刚将这部分留在了你的帖子之外)。我建议您使用Option ExplicitDim所有变量。这有助于避免数据类型错误,并避免使用Variant来处理所有事情。当你需要Variants时很好,但是当你不需要时,其他数据类型通常会更快。

答案 2 :(得分:0)

一旦我花了几周时间将其他宏从基于范围的逻辑重构为抽象的数据结构逻辑,一旦我回到这个宏,答案就会打动我。如果我只是模仿范围逻辑以便更快地完成宏,那么我只需要填充数组,使其在转置后匹配范围。这意味着我不需要修剪数组或以任何方式操纵其形式 - 我只需要以数组形式填充数据结构,然后将其转置到电子表格中。一旦数组填满,我也可以替代使用数据。

以下是解决方案代码:

Sub AcquireData()

'The array 'MyArray' was dimensioned as a dynamic array in the declarations section at the top of the module.
'Redim the array to a big 2 dimensional array that fits the needs of the data/macro.
ReDim MyArray(1 To 20, 1 To 20000)

'From here on, simply mimic the logic of the range macro... [i]
CurrentServerRow = 13

WhileLoopHolder = 1

If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

    NewWorksheetLine_Sub

End If

Do While WhileLoopHolder = 1

    If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

            NewWorksheetLine_Sub

        End If

    ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then

            '[i] ... except, move the values into the array in Column, Row logic form.
            MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
            ValueSets = ValueSets + 1
        End If

    Else

        If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

            MyArray(WorksheetColumn, WorksheetRow) = "X"

        Else

            MyArray(WorksheetColumn, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

        End If

        MyArray(WorksheetColumn + 1, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
        MyArray(WorksheetColumn + 2, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
        MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
        WorksheetColumn = WorksheetColumn + 3
        ValueSets = ValueSets + 1

    End If

    CurrentServerRow = CurrentServerRow + 1

    If CurrentServerRow > 41 Then

        WhileLoopHolder = 0

    End If

Loop

ArrayToWorkSheet_Sub

End Sub

Sub NewWorksheetLine_Sub()

    WorksheetRow = WorksheetRow + 1
    WorksheetColumn = 1
    ValueSets = 10

End Sub

'When finished with the loop, push the array to the worksheet, and transpose it to provide the correct column to row relationship in the spreadsheet.
Sub ArrayToWorkSheet_Sub()

Dim ArrayLimit As Long

Dim LastCell As Long

Dim MyRange As Range

'This level of precision in setting the range appears unnecessary, but in theory I think it could speed up tranposing the array - [ii]
'[ii]but that is just speculation. Performance improvements for the tranposition appear to be minor, perhaps due to the fact that [iii]
'[iii]most - if not nearly all - of the intense computations occur earlier.
With Sheets("Sheet2")

ArrayLimit = UBound(MyArray, 2)

LastCell = ArrayLimit + 1

Set MyRange = .Range("A2:S" & LastCell)

MyRange = WorksheetFunction.Transpose(MyArray)

End With

End Sub

虽然Application.ScreenUpdating = FalseApplication.Calculation = xlCalculationManual在减少宏运行时都是非常宝贵的,但我在将这两行与抽象数据结构结合使用方面有非常积极的经验。在某些情况下,似乎数据结构似乎有助于优化性能,尤其是在宏过程中涉及大量逐行数据提取的情况下。