切片数组使用大于65000的索引

时间:2018-02-03 14:07:24

标签: arrays vba indexing

我使用下面的代码计算最大值,如本文所述(vba max value of group of values)。代码工作得很好但是一旦我有超过65k行,我在尝试使用数组时会得到数据类型不匹配:

sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)

有人可以帮我把数组切成块。我试图让它自己工作,但没有任何运气。

    Sub FillGroupsMax()
        Dim lColumn As Long
        Dim sht As Worksheet
        Dim groupsArray As Variant    'array with all group infomation
        Dim groupsSeen As Variant    'array with group infomation already seen

        Application.ScreenUpdating = False    'stop screen updating makes vba perform better

        Set sht = ThisWorkbook.Worksheets("import")
        Set last = sht.Range("A:A").Find("*", Cells(1, 1), searchdirection:=xlPrevious)    'last cell with value in column A
        lColumn = sht.Cells(1, Columns.Count).End(xlToLeft).Column

        groupsArray = sht.Range(Cells(1, 1), Cells(last.Row, lColumn))
        'collect all the information on the Sheet into an array
        'Improves performance by not visiting the sheet

        For dRow = 2 To last.Row    'for each of the rows skipping header

            'check if group as already been seen
            If inArrayValue(Cells(dRow, 1).Value, groupsSeen) > 0 Then
                'if it has been seen/calculated attribute value
                'Cells(dRow, 4).Value = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
                groupsArray(dRow, lColumn) = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
            Else
                'if it hasn't been seen then find max
                'Cells(dRow, 4).Value = getMax(Cells(dRow, 1).Value, groupsArray)
                groupsArray(dRow, lColumn) = getMax(Cells(dRow, 1).Value, groupsArray, lColumn)

                'array construction from empty
                If IsEmpty(groupsSeen) Then
                    ReDim groupsSeen(0)
                    'groupsSeen(0) = Array(Cells(dRow, 1).Value, Cells(dRow, 4).Value)
                    groupsSeen(0) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
                    'attribute value to array
                Else
                    ReDim Preserve groupsSeen(0 To UBound(groupsSeen) + 1)
                    groupsSeen(UBound(groupsSeen)) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
                End If
            End If
        Next

    sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
        'reactivate Screen updating
        Application.ScreenUpdating = True

    End Sub

    Function getMax(group As String, groupsArray As Variant, lColumn As Long) As Double

        'for each in array
        For n = 1 To UBound(groupsArray)
            'if its the same group the Max we seen so far the record
            If groupsArray(n, 1) = group And groupsArray(n, lColumn - 1) > maxSoFar Then
                maxSoFar = groupsArray(n, lColumn - 1)
            End If
        Next

        'set function value
        getMax = maxSoFar
    End Function

    Function inArrayValue(group As String, groupsSeen As Variant) As Double

        'set function value
        inArrayValue = 0
        'if array is empty then exit
        If IsEmpty(groupsSeen) Then Exit Function

        'for each in array
        For n = 0 To UBound(groupsSeen)
            'if we find the group
            If groupsSeen(n)(0) = group Then
                'set function value to the Max value already seen
                inArrayValue = groupsSeen(n)(1)
                'exit function earlier
                Exit Function
            End If
        Next

    End Function

4 个答案:

答案 0 :(得分:2)

您可以编写一个帮助函数来代替Application.Index

奖金 - 它比使用Index(> 5x)

快得多
Sub Tester()

    Dim arr, arrCol

    arr = Range("A2:J80000").Value

    arrCol = GetColumn(arr, 5) '<< get the fifth column

    Range("L2").Resize(UBound(arrCol, 1), 1).Value = arrCol

End Sub

'extract a single column from a 1-based 2-D array
Function GetColumn(arr, colNumber)
    Dim arrRet, i As Long
    ReDim arrRet(1 To UBound(arr, 1), 1 To 1)
    For i = 1 To UBound(arr, 1)
        arrRet(i, 1) = arr(i, colNumber)
    Next i
    GetColumn = arrRet
End Function

编辑 - 因为QHarr在这里询问了时间,这是一个基本的例子

Sub Tester()
    Dim arr, arrCol, t, i as long
    arr = Range("A2:J80000").Value
    t = Timer
    For i = 1 to 100 
        arrCol = GetColumn(arr, 5) '<< get the fifth column
    Next i
    Debug.print Timer - t '<<# of seconds for execution
End Sub

答案 1 :(得分:0)

您使用的是.xls文件,因为它们只有65k行。如果是这样,将文件保存为.xlsx,因为它将有超过一百万行

答案 2 :(得分:0)

下面,尽管不是那么整洁,但是一种以块的形式处理数组的方法和索引来访问列并写出到表格。

我用数据填充了两列(A:B)。两者都有132,000行,逐步填充,每列中的值为1到132,000,用于我的测试运行。

您可以使用cutOff来调整块大小,使其低于发生故障的位置。

下面的代码只是为了演示批量循环的原理,直到每批中的设置截止值,直到所有行都被处理完毕。

 Option Explicit

 Public Sub WriteArrayToSheet()

   Dim wb As Workbook
   Dim ws As Worksheet

   Set wb = ThisWorkbook
   Set sht = wb.Worksheets("Sheet1") 'change as appropriate

   Dim myArr() 'dynamic array

   myArr = sht.Range("A1").CurrentRegion.Value 'you may want a more robust method

   Dim cutOff As Long 'the max value - what ever it is before error occurs
   cutOff = 1000

   Dim totalRows As Long 'total rows in array read in from sheet
   totalRows = UBound(myArr, 1)

   Dim totalArraysNeeded As Long

   'Determine how many lots of cutOff chunks there are in the total number of array rows
   totalArraysNeeded = Application.WorksheetFunction.Ceiling(totalRows / cutOff, 1)

   Dim rotations As Long 'number of times to loop original array to handle all rows
   Dim rowCountTotal As Long
   Dim rowCount As Long
   Dim tempArr() 'this will hold the chunk of the original array

   Dim rowCounter As Long
   Dim lastRow As Long
   Dim nextRow As Long
   Dim i As Long
   Dim j As Long
   Dim numRows As Long

   rotations = 1

   Do While rotations < totalArraysNeeded

       If rotations < totalArraysNeeded - 1 Then

           ReDim tempArr(1 To cutOff, 1 To UBound(myArr, 2)) 'size chunk array
           numRows = cutOff

       Else
           numRows = totalRows - rowCountTotal
           ReDim tempArr(1 To numRows, 1 To UBound(myArr, 2)) 'size chunk array

       End If

       For i = 1 To numRows

           rowCount = 1 'rows in this chunk looped
           rowCountTotal = rowCountTotal + 1  'rows in original array looped

           For j = LBound(myArr, 2) To UBound(myArr, 2)

               tempArr(i, j) = myArr(rowCountTotal, j)

           Next j

           rowCount = rowCount + 1

       Next i

       With sht
           lastRow = .Cells(.Rows.Count, "E").End(xlUp).Row 'Column where I am writing the sliced column out to
       End With

       If lastRow = 1 Then
           nextRow = 1
       Else
           nextRow = lastRow + 1
       End If

       sht.Range("E" & nextRow).Resize(UBound(tempArr, 1), 1) = Application.Index(tempArr, , 1) 'write out to sheet

       rotations = rotations + 1

   Loop

End Sub

答案 3 :(得分:0)

正如@Tim建议的那样,切片大数组的最佳方法是使用循环来复制列。

虽然在您的情况下,由于您的代码使用嵌套循环,因此大部分处理时间都用于计算最大值。

如果您想显着缩短处理时间,请使用字典:

Sub Usage
   GetMaxByGroupTo _
        sourceGroups := ThisWorkbook.Range("Sheet1!A2:A100"), _
        sourceValues := ThisWorkbook.Range("Sheet1!B2:B100"), _
        target := ThisWorkbook.Range("Sheet1!C2")
End Sub

Sub GetMaxByGroupTo(sourceGroups As Range, sourceValues As Range, target As Range)
    Dim dict As Object, groups(), values(), r As Long, max

    Set dict = CreateObject("Scripting.Dictionary")
    groups = sourceGroups.Value2
    values = sourceValues.Value2

    ' store the maximum value of each group in a dictionary for an efficient lookup '

    For r = Lbound(groups) to Ubound(groups)
        max = dict(groups(r, 1))
        If VarType(max) And values(r, 1) <= max Then Else dict(groups(r, 1)) = values(r, 1)
    Next

    ' build and copy the result array to the sheet  '

    For r = Lbound(groups) to Ubound(groups)
        values(r, 1) = dict(groups(r, 1))
    Next

    target.Resize(Ubound(groups), 1).Value2 = values

End Sub