查找重复行并对列的值求和

时间:2016-04-01 10:49:06

标签: excel vba

我有18000行和26列。

示例数据:

A(Name)     B(Mat_Num)  C(Items) D(group)   E(Summon)   F(Plant) G(Batch_num)
1.Ram       1235         HA1      Micro      545.5      1327      893A1
2.ram       12354        rt2      Senf       5678       0001      1063F
3.Joseph    12354        cf1      Macro      9844       0001      1063F
4.andreas   12354        dw1      HR         6633.95    0001      1063F
5.John      1235         ff1      Finance    22555.09   1327      893A1
6.Russel     987         ad1      Sales      6423       0001      jjg67
7.Holger      00         dd1      purchase   3333       1327      dd567
8.Gottfried   234        fa1      rot        663        345       45678

我必须根据列(B,F,G)找到重复的行。如果这三列的行相同,则将E列的单元格值与一行相加,并删除重复的行以仅保留其中一行。

结果:

 A(Name)     B(Mat_Num)  C(Items) D(group)   E(Summon)   F(Plant) G(Batch_num)
1.Ram       1235         HA1      Micro      23101      1327      893A1
2.ram       12354        rt2      Senf       22155.95   0001      1063F

我已经浏览了一些网站和博客,想出下面发布的代码。

Sub Sample()
    Dim LastRowcheck As Long, n1 As Long
    Dim DelRange As Range

    With Worksheets("Sheet1")
        LastRowcheck = .Range("A" & .Rows.Count).End(xlUp).Row

        For n1 = 1 To LastRowcheck
            If .Cells(n1, 1).Value = Cells(n1 + 1, 1).Value Then
                If DelRange Is Nothing Then
                    Set DelRange = .Rows(n1)
                Else
                    Set DelRange = Union(DelRange, .Rows(n1))
                End If
            End If
        Next n1

        If Not DelRange Is Nothing Then DelRange.Delete
    End With
End Sub

3 个答案:

答案 0 :(得分:1)

这应该很快就能解决它。只要18K行数据可以处理成总和。

Sub Sum_and_Dedupe()
    With Worksheets("sheet1")
        'deal with the block of data radiating out from A1
        With .Cells(1, 1).CurrentRegion
            'step off the header and make one column wider
            With .Resize(.Rows.Count - 1, .Columns.Count + 1).Offset(1, 0)
                .Columns(.Columns.Count).Formula = "=sumifs(e:e, b:b, b2, f:f, f2, g:g, g2)"
                .Columns(5) = .Columns(.Columns.Count).Value
                .Columns(.Columns.Count).Delete
            End With

            'remove duplicates
            .RemoveDuplicates Columns:=Array(2, 6, 7), Header:=xlYes
        End With
        .UsedRange
    End With
End Sub

对于18K行的随机数据,这需要大约18秒。你自己的结果将根据硬件和软件而有所不同,但这应该是大概的。

sum_and_dedupe_before
Sum_and_Dedupe()之前的样本数据

sum_and_dedupe_after
Sum_and_Dedupe()之后的样本数据

答案 1 :(得分:1)

这里"棒球场" #2

Sub main()
Dim helperRng As Range

With Worksheets("Sheet01")
    With .UsedRange
        Set helperRng = .Offset(, .Columns.Count + 1).Resize(, 1)
        With helperRng
            .FormulaR1C1 = "=concatenate(RC2, RC6, RC7)"
            .Offset(, 1).FormulaR1C1 = "=if(countif(R1C[-1]:RC[-1], RC[-1])=1,1,"""")"
            With .Offset(, 2)
                .FormulaR1C1 = "=sumif(C[-2], RC[-2],C5)"
                .Value = .Value
            End With
            .Offset(, 1).SpecialCells(xlCellTypeFormulas, xlTextValues).EntireRow.Delete
            Worksheets("Sheet01").Columns(5).Resize(.Rows.Count - 1).Offset(1).Value = .Offset(1, 2).Resize(.Rows.Count - 1).Value
            helperRng.Resize(, 3).Clear
        End With
    End With
End With

End Sub

只有好奇哪个更快!

答案 2 :(得分:0)

可以使用数组和字典对象在18毫秒(略微夸张)中完成此操作。我知道求和的值在第4列中,从而简化了函数。您可以在其他列中为多个值调整代码。我正在从1个数组写入另一个数组(从InAy到OutAy),字典确定行是否已存在。魔术发生在字典的Item属性中。当写入新的OutAy行时,我将item属性值分配给行(r)。然后,当它已经存在时,我使用item属性值检索将其写入OutAy的行(r):d.item(KeyIn)然后可以用现有值的总和更新OutAy(r,c)中的该值和新值“ KeyVal”。

这解决了与SQL查询聚合相同的问题:“从数据组中按a,b,c选择a,b,c,sum(d)”

注意:添加一个工具->对Microsoft脚本运行时的引用

    sub some()
     ...
     data = Range("WhereYourDataIs") 'create data array
     Range("WhereYourDataIs").clear 'assumes you'll output to same location
     data = RemoveDupes(data) 'removedupes and sum values
     Range("A2").Resize(UBound(data), UBound(data, 2)) = data 'A2 assumes your data headers begin in row 1, column 1
     ...
    End Sub

Function RemoveDupes(InAy As Variant) As Variant
    Dim d As Scripting.Dictionary
    Set d = New Scripting.Dictionary
    ReDim OutAy(1 To UBound(InAy), 1 To 4)
    r = 1

    For i = 1 To UBound(InAy)
        KeyIn = ""
        KeyVal = InAy(i, 4) 'the value field to sum/aggregate if exists
        For c = 1 To 3 'a, b, c metadata to roll up
            KeyIn = KeyIn & InAy(i, c)
        Next c
        If d.Exists(KeyIn) Then
            OutAy(d.item(KeyIn), 4) = OutAy(d.item(KeyIn), 4) + KeyVal 'the summation of value field for existing row in OutAy
            Else:
            d.Add KeyIn, r 'r is set as the item value referencing the row of the OutAy when it was first added. The reference is used when .Exists is true
            For c = 1 To 4
                OutAy(r, c) = InAy(i, c)
            Next c
            r = r + 1
        End If
    Next
    RemoveDupes = OutAy
End Function