Excel VBA Macro整理了超过80,000多行数据,需要23个小时才能完成

时间:2017-07-12 14:10:48

标签: excel vba excel-vba

我是新来的,刚进入Excel VBA编码。

我遇到了一个我运行的代码片段的问题。我有多个列来显示贷款支付的状态。

  • 第1列显示年份的数值(即Jan = 1,Feb = 2等)。
  • 第4列显示合同的ID号。
  • 第5列显示贷款支付的状态(CURRENT =当前支付,1-23 = 1至23天拖欠,24-59 = 24至59天拖欠等)。
  • 第6列显示了$$$欠款的数量。

我的目标:

  1. 让代码使用ID并在其下方的列表中搜索ID号。
  2. 然后,如果ID中有匹配项,它会检查月份是否是连续的(即当前行中ID的月份是否与循环行中的ID匹配,如果月份是当前行的1,因此,如果它在1月份找到了ID的匹配项,它将在2月份查找相同的ID)。
  3. 然后,它会检查第5栏,看看该帐户是否已从上个月转为违法行为(即如果违法行为是1月份的当前行为,然后是2月份的1-23天违约行为)。
  4. 最后,如果所有这些都是真的(ID的匹配,月份是连续的,并且拖欠时间已经移位),它将减少与拖欠班次相关的列中的欠款(第7-9列)。
  5. 我的问题:我正在运行状态栏更新,以检查它是否正在运行。它每秒大约进行1次迭代。在83,110行数据中,我计算这需要大约23个小时。圣烟!有没有办法加快这个?

    这是我的代码:

    Sub DecipherDPD()
    
    Dim i As Long, CurrentRow As Long, IDSearch As Long
    Dim RawData As Worksheet
    Dim CollectionsWB As Workbook
    
    Set CollectionsWB = ThisWorkbook
    
    Set RawData = CollectionsWB.Worksheets("RAW DATA")
    
    CurrentRow = 2
    
    With Application
        .ScreenUpdating = False
        .EnableEvents = False
    End With
    
    For i = CurrentRow To 83111 Step 1
        For IDSearch = i + 1 To 83111 Step 1
            If RawData.Cells(IDSearch, 4) = RawData.Cells(i, 4) Then
                If RawData.Cells(IDSearch, 1).Value = RawData.Cells(i, 1).Value + 1 Then
                    If RawData.Cells(i, 5) = "CURRENT" Then
                        If RawData.Cells(IDSearch, 5) = "1-23" Then
                            RawData.Cells(IDSearch, 7) = RawData.Cells(IDSearch, 6)
                        End If
                    End If
                End If
                If RawData.Cells(IDSearch, 1).Value = RawData.Cells(i, 1).Value + 1 Then
                    If RawData.Cells(i, 5) = "1-23" Then
                        If RawData.Cells(IDSearch, 5) = "24-59" Then
                            RawData.Cells(IDSearch, 8) = RawData.Cells(IDSearch, 6)
                        End If
                    End If
                End If
                If RawData.Cells(IDSearch, 1).Value = RawData.Cells(i, 1).Value + 1 Then
                    If RawData.Cells(i, 5) = "24-59" Then
                        If RawData.Cells(IDSearch, 5) = "60-90" Then
                            RawData.Cells(IDSearch, 9) = RawData.Cells(IDSearch, 6)
                        End If
                    End If
                End If
                If RawData.Cells(IDSearch, 1).Value = RawData.Cells(i, 1).Value + 1 Then
                    If RawData.Cells(i, 5) = "60-90" Then
                        If RawData.Cells(IDSearch, 5) = "90+" Then
                            RawData.Cells(IDSearch, 9) = RawData.Cells(IDSearch, 6)
                        End If
                    End If
                End If
            End If
        Next IDSearch
    Next i
    
    With Application
        .ScreenUpdating = True
        .EnableEvents = True
    End With
    
    End Sub
    

    任何帮助将不胜感激!谢谢! -Joe

2 个答案:

答案 0 :(得分:0)

编辑:

我只是注意到你确实改变了写操作的列索引。尽管如此,请考虑以下有关使用数组的建议并将逻辑分离出去,它应该可以工作!

根据建议,您应该查看数组读取以加快操作。为了使转换更容易(并且可能会加速您的代码),请尝试使用AndOr来合并逻辑语句并减少If的数量。此版本几乎等效,可以在您的代码中使用想法:

For i = CurrentRow To 83111 Step 1
    For IDSearch = i + 1 To 83111 Step 1
        If RawData.Cells(IDSearch, 4).Value = RawData.Cells(i, 4).Value And RawData.Cells(IDSearch, 1).Value = RawData.Cells(i, 1).Value + 1 Then
            If     (RawData.Cells(i, 5) = "CURRENT" And RawData.Cells(IDSearch, 5) = "1-23") _
                Or (RawData.Cells(i, 5) = "1-23" And RawData.Cells(IDSearch, 5) = "24-59") _ 
                Or (RawData.Cells(i, 5) = "24-59" And RawData.Cells(IDSearch, 5) = "60-90") _ 
                Or (RawData.Cells(i, 5) = "60-90" And RawData.Cells(IDSearch, 5) = "90+") Then
                    RawData.Cells(IDSearch, 7) = RawData.Cells(IDSearch, 6)
            End If
        End If
    Next IDSearch
Next I

然后使用数组看起来像这样:

Dim arr() As Variant
arr = RawData.Range("A" & CurrentRow & ":G" & 83111).Value
For i = LBound(arr,1) To UBound(arr,1)
    For IDSearch = i + 1 To UBound(arr,1)
        If arr(IDSearch, 4) = arr(i, 4) And arr(IDSearch, 1) = arr(i, 1) + 1 Then
            If     (arr(i, 5) = "CURRENT" And arr(IDSearch, 5) = "1-23") _
                Or (arr(i, 5) = "1-23" And arr(IDSearch, 5) = "24-59") _ 
                Or (arr(i, 5) = "24-59" And arr(IDSearch, 5) = "60-90") _ 
                Or (arr(i, 5) = "60-90" And arr(IDSearch, 5) = "90+") Then
                    arr(IDSearch, 7) = arr(IDSearch, 6)
            End If
        End If
    Next IDSearch
Next i
RawData.Range("A" & CurrentRow & ":G" & 83111).Value = arr

不需要ScreenUpdating小提琴,因为屏幕只在最后才更新,单次读取和单次写入操作使得这种方法变得快速。

答案 1 :(得分:0)

有几种不同的方法可以解决这个问题。一个类似于Wolfie(将数据读入数组+循环),另一个使用常规工作表公式。

首先,您需要将数据转换为表格(转到插入 - >表格)并为其/列提供描述性名称(您可以在创建表格后点击“设计”标签来命名表格,看着左上角,有一个小地方可以输入一个名字)。这本身将使您的代码(以及源数据表上的任何公式)更易读,更易于维护。使用表格,如果在源数据中添加/删除/移动列,则无需担心修改宏。

(看起来您的数据基本上已经是表格格式,因此创建表格不应该出现问题。但如果确实如此,只需将源数据复制到另一张表格并在那里制作表格。)

然后,您将要按贷款ID和期间对表进行排序。使用已排序的数据通常显着比使用未排序的数据更快,  绝对是在这种情况下,也在many other cases。这就是你的表在这一点上的样子:

img

然后你有几个选择。通过将数据读入数组,其中一个与Wolfie相同,但由于数据已排序,因此您找到的任何“匹配”将始终为1行,并且您可以避免多次循环数据。在我的测试中,这个版本平均约1.5秒,有一个10万行表。

其他工作方式类似,但使用工作表公式。它有点慢(使用100k行表平均约为5秒)但如果您需要进行修改,我发现它更容易使用。

无论哪种方式,这都是事后的样子:

img

Sub calculateLoanStatus()

    Application.ScreenUpdating = False

    Dim startTimeSetup As Double, endTimeSetup As Double
    Dim startTimeArr As Double, endTimeArr As Double
    Dim startTimeForm As Double, endTimeForm As Double

    'Sort data in table by loan ID and period
    startTimeSetup = Timer
    Dim loanTbl As ListObject
    Set loanTbl = ThisWorkbook.Sheets("Raw Data").ListObjects("LoanTable")
    Call sortColumn(loanTbl, "Period", xlAscending)
    Call sortColumn(loanTbl, "Loan ID", xlAscending)
    endTimeSetup = Timer - startTimeSetup

    'Option 1: Use helper method to copy data into array + paste it back to the table
    'Faster, but slightly more difficult to maintain
    startTimeArr = Timer
    Call versionOne(loanTbl)
    endTimeArr = Timer - startTimeArr + endTimeSetup

    'Option 2: Use helper method to create formulas + copy the results to each column
    'Slower, but slightly easier to maintain
    startTimeForm = Timer
    Call versionTwo(loanTbl, "Current to 1-23", "Current", "1 to 23")
    Call versionTwo(loanTbl, "1-23 to 24-59", "1 to 23", "24 to 59")
    Call versionTwo(loanTbl, "24-59 to 60-90", "24 to 59", "60 to 90")
    Call versionTwo(loanTbl, "60-90 to 90+", "60 to 90", "90+")
    With loanTbl.DataBodyRange
        .Copy
        .PasteSpecial xlPasteValues
        Application.CutCopyMode = False
    End With
    endTimeForm = Timer - startTimeForm + endTimeSetup

    'End
    Application.ScreenUpdating = True
    MsgBox "Array version: " & endTimeArr & vbNewLine & "Formula version: " & endTimeForm

End Sub
Sub versionOne(tbl As ListObject)
'Read data from a sorted table (tbl) into an array and calculate values

    Dim arr() As Variant
    Dim i As Long
    Dim nextRow As Long
    Dim startCol As Long
    Dim periodCol As Long, loanIDCol As Long, statusCol As Long, amountCol As Long
    Dim resultCol1 As Long, resultCol2 As Long, resultCol3 As Long, resultCol4 As Long

    startCol = tbl.ListColumns(1).Range.Column
    periodCol = startCol + tbl.ListColumns("Period").Range.Column - 1
    loanIDCol = startCol + tbl.ListColumns("Loan ID").Range.Column - 1
    statusCol = startCol + tbl.ListColumns("Status").Range.Column - 1
    amountCol = startCol + tbl.ListColumns("Amount Owed").Range.Column - 1
    resultCol1 = startCol + tbl.ListColumns("Current to 1-23").Range.Column - 1
    resultCol2 = startCol + tbl.ListColumns("1-23 To 24-59").Range.Column - 1
    resultCol3 = startCol + tbl.ListColumns("24-59 To 60-90").Range.Column - 1
    resultCol4 = startCol + tbl.ListColumns("60-90 To 90+").Range.Column - 1

    arr = tbl.DataBodyRange.Value

    For i = LBound(arr, 1) To UBound(arr, 1) - 1
        nextRow = i + 1
        arr(nextRow, resultCol1) = ""
        arr(nextRow, resultCol2) = ""
        arr(nextRow, resultCol3) = ""
        arr(nextRow, resultCol4) = ""
        If arr(nextRow, loanIDCol) = arr(i, loanIDCol) And arr(nextRow, periodCol) = arr(i, periodCol) + 1 Then
            If (arr(i, statusCol) = "Current" And arr(nextRow, statusCol) = "1 to 23") Then
                arr(nextRow, resultCol1) = arr(nextRow, amountCol)
            ElseIf (arr(i, statusCol) = "1 to 23" And arr(nextRow, statusCol) = "24 to 59") Then
                arr(nextRow, resultCol2) = arr(nextRow, amountCol)
            ElseIf (arr(i, statusCol) = "24 to 59" And arr(nextRow, statusCol) = "60 to 90") Then
                arr(nextRow, resultCol3) = arr(nextRow, amountCol)
            ElseIf (arr(i, statusCol) = "60 to 90" And arr(nextRow, statusCol) = "90+") Then
                arr(nextRow, resultCol4) = arr(nextRow, amountCol)
            End If
        End If
    Next i

    tbl.DataBodyRange.Value = arr

End Sub
Sub versionTwo(tbl As ListObject, shiftCol As String, oldStatus As String, newStatus As String)
'Add formula to a column (shiftCol) in a table (tbl)
'Use loan account's status in the prior (oldStatus) and current (newStatus) months to whether to add value
'If desired, copy the results of the formula in place (copyText)

    Dim refCol As Long
    Dim periodDist As Long
    Dim loanIDDist As Long
    Dim statusDist As Long
    Dim formulaText As String

    'Determine the distance between your reference column and each value column
    refCol = tbl.ListColumns(shiftCol).Range.Column
    periodDist = tbl.ListColumns("Period").Range.Column - refCol
    loanIDDist = tbl.ListColumns("Loan ID").Range.Column - refCol
    statusDist = tbl.ListColumns("Status").Range.Column - refCol

    'Make formula and add it to the column
    formulaText = "=IF(AND(R[-1]C[" & periodDist & "]=[@Period]-1,R[-1]C[" & loanIDDist & "]=[@[Loan ID]]," & _
                  "R[-1]C[" & statusDist & "]=""" & oldStatus & """,[@Status]=""" & newStatus & """),[@[Amount Owed]],"""")"

    tbl.ListColumns(shiftCol).DataBodyRange.FormulaR1C1 = formulaText

End Sub
Sub sortColumn(tbl As ListObject, toSort As String, sOrder As Variant)
'Sorts column (toSort) in a table (tbl) in a given order (sOrder)

    With tbl.Sort
        .SortFields.Clear
        .SortFields.Add Key:=tbl.ListColumns(toSort).Range, _
                SortOn:=xlSortOnValues, _
                Order:=sOrder, _
                DataOption:=xlSortNormal
        .Apply
    End With

End Sub