COUNTIF freezes Excel with 700k rows on the sheet

Time: 2018-12-04 23:17:00

Tags: excel excel-vba bigdata access countif

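I currently have two lists: a list of "Grantors" in column A, and the same list with duplicates removed in column B. I am trying to use COUNTIF to count how many times a given grantor appears in column A, but my list in column A is over 700k rows long. I am using 64-bit Excel, but every time I run the code to do this, Excel freezes and crashes.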

Is there a way to do this in Excel, or do I need to take another approach, such as using a pivot table or creating a table in Access?

I have written a few subroutines, but this is the latest one, which I took from another post on this forum.

Sub Countif()

  Dim lastrow As Long
  Dim rRange As Range
  Dim B As Long '< dummy variable to represent column B

  B = 2

  With Application
    .ScreenUpdating = False 'speed up processing by turning off screen updating
    .DisplayAlerts = False
  End With

  'set up a range to have formulas applied
  With Sheets(2)
    lastrow = Cells(Rows.Count, "A").End(xlUp).Row
    Set rRange = .Range(.Cells(2, B), .Cells(lastrow, B))
  End With

  'apply the formula to the range
  rRange.Formula = "=COUNTIF($A$2:$A$777363,C2)"
  'write back just the value to the range
  rRange.Value = rRange.Value

  With Application
    .ScreenUpdating = True
    .DisplayAlerts = True
  End With

End Sub

2 answers:

Answer 0: (score: 1)

Something like this:

Sub Countif()

    Dim allVals, uniqueVals, i As Long, dict, v, dOut(), r As Long

     ''creating dummy data
'    With Sheet2.Range("A2:A700000")
'        .Formula = "=""VAL_"" & round(RAND()*340000,0)"
'        .Value = .Value
'    End With
'

    'get the raw data and unique values
    With Sheet2
        allVals = .Range("A2:A" & .Cells(.Rows.Count, "A").End(xlUp).Row).Value
        uniqueVals = .Range("B2:B" & .Cells(.Rows.Count, "B").End(xlUp).Row).Value
    End With
    ReDim dOut(1 To UBound(uniqueVals, 1), 1 To 1) 'for counts...

    Set dict = CreateObject("scripting.dictionary")
    'map unique value to index
    For i = 1 To UBound(uniqueVals, 1)
        v = uniqueVals(i, 1)
        If Len(v) > 0 Then dict(v) = i
    Next i


    'loop over the main list and count each unique value in colB
    For i = 1 To UBound(allVals, 1)
        v = allVals(i, 1)
        If Len(v) > 0 Then
            If dict.exists(v) Then
                r = dict(v)
                dOut(r, 1) = dOut(r, 1) + 1
            End If
        End If
    Next i

    'output the counts
    Sheet2.Range("C2").Resize(UBound(dOut, 1), 1).Value = dOut

End Sub

This runs in about 30 seconds with 700k values in column A and 300k unique values in column B.
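
The reason this tends to be so much faster than a worksheet COUNTIF is that each value in the long list is looked up once in a dictionary, instead of every unique value rescanning the whole list. A minimal sketch of that counting idea on a tiny in-memory array follows. The procedure name DictionaryCountDemo and the sample grantor names are made up for illustration and are not part of the answer above.

```vba
Option Explicit

' Hypothetical demo: counts occurrences in one pass using a late-bound
' Scripting.Dictionary (no library reference required).
Sub DictionaryCountDemo()
    Dim sample As Variant
    Dim dict As Object
    Dim i As Long
    Dim k As Variant

    ' Made-up sample data standing in for column A
    sample = Array("Smith", "Jones", "Smith", "Lee", "Jones", "Smith")

    Set dict = CreateObject("Scripting.Dictionary")

    ' Single pass: reading a missing key yields Empty, and Empty + 1 = 1
    For i = LBound(sample) To UBound(sample)
        dict(sample(i)) = dict(sample(i)) + 1
    Next i

    ' Print each distinct value with its count to the Immediate window
    For Each k In dict.Keys
        Debug.Print k, dict(k)
    Next k
End Sub
```

Run it from the VBA editor and check the Immediate window (Ctrl+G) for one line per distinct name. The same loop scales to a 700k-row array read from the sheet in one step, which is what the answer's code does.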

Answer 1: (score: 1)

...or maybe this:

  

Warning: this overwrites the de-duplicated values in column A of the target worksheet.

Option Explicit

Sub countUnique()
    Dim arr As Variant, i As Long, dict As Object

    Debug.Print Timer

    Set dict = CreateObject("scripting.dictionary")
    dict.comparemode = vbTextCompare

    With Worksheets("sheet2")
        arr = .Range(.Cells(2, "A"), .Cells(.Rows.Count, "A").End(xlUp)).Value2
    End With

    For i = LBound(arr, 1) To UBound(arr, 1)
        dict.Item(arr(i, 1)) = dict.Item(arr(i, 1)) + 1
    Next i

    With Worksheets("sheet3")
        .Cells(2, "A").Resize(dict.Count, 1) = bigTranspose(dict.keys)
        .Cells(2, "B").Resize(dict.Count, 1) = bigTranspose(dict.items)
    End With

    Debug.Print Timer

End Sub

Function bigTranspose(arr1 As Variant)
    Dim t As Long
    ReDim arr2(LBound(arr1) To UBound(arr1), 1 To 1)

    For t = LBound(arr1) To UBound(arr1)
        arr2(t, 1) = arr1(t)
    Next t
    bigTranspose = arr2
End Function
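
42.64 seconds for 700K original rows and 327K unique values on a Surface Pro tablet. This could probably be improved by turning off calculation and event handling. Screen updating really isn't a concern here.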
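
As a rough sketch of that suggestion, the macro can be wrapped so that automatic calculation and event handling are switched off while it runs and restored afterwards. The wrapper name RunWithFastSettings is hypothetical. It assumes countUnique from this answer is in the same project, and whether it actually shortens the run depends on the workbook (for example, on how many formulas depend on the cells being written).

```vba
Option Explicit

' Hypothetical wrapper, not part of the original answer.
Sub RunWithFastSettings()
    Dim prevCalc As XlCalculation
    Dim prevEvents As Boolean
    Dim errNum As Long
    Dim errDesc As String

    ' Remember the user's current settings so they can be restored
    prevCalc = Application.Calculation
    prevEvents = Application.EnableEvents

    On Error GoTo CleanUp
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False

    countUnique   ' the procedure defined in this answer

CleanUp:
    ' Capture any error before doing further work, then restore settings
    errNum = Err.Number
    errDesc = Err.Description
    Application.Calculation = prevCalc
    Application.EnableEvents = prevEvents
    If errNum <> 0 Then MsgBox "countUnique failed: " & errDesc
End Sub
```

Restoring the previous values, rather than forcing xlCalculationAutomatic and True, avoids changing settings the user had chosen deliberately before running the macro.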