我有18000行和26列。
示例数据:
A(Name) B(Mat_Num) C(Items) D(group) E(Summon) F(Plant) G(Batch_num)
1.Ram 1235 HA1 Micro 545.5 1327 893A1
2.ram 12354 rt2 Senf 5678 0001 1063F
3.Joseph 12354 cf1 Macro 9844 0001 1063F
4.andreas 12354 dw1 HR 6633.95 0001 1063F
5.John 1235 ff1 Finance 22555.09 1327 893A1
6.Russel 987 ad1 Sales 6423 0001 jjg67
7.Holger 00 dd1 purchase 3333 1327 dd567
8.Gottfried 234 fa1 rot 663 345 45678
我必须根据列(B,F,G)找到重复的行。如果这三列的行相同,则将E列的单元格值与一行相加,并删除重复的行以仅保留其中一行。
结果:
A(Name) B(Mat_Num) C(Items) D(group) E(Summon) F(Plant) G(Batch_num)
1.Ram 1235 HA1 Micro 23101 1327 893A1
2.ram 12354 rt2 Senf 22155.95 0001 1063F
我已经浏览了一些网站和博客,想出下面发布的代码。
Sub Sample()
Dim LastRowcheck As Long, n1 As Long
Dim DelRange As Range
With Worksheets("Sheet1")
LastRowcheck = .Range("A" & .Rows.Count).End(xlUp).Row
For n1 = 1 To LastRowcheck
If .Cells(n1, 1).Value = Cells(n1 + 1, 1).Value Then
If DelRange Is Nothing Then
Set DelRange = .Rows(n1)
Else
Set DelRange = Union(DelRange, .Rows(n1))
End If
End If
Next n1
If Not DelRange Is Nothing Then DelRange.Delete
End With
End Sub
答案 0 :(得分:1)
这应该很快就能解决它。只要18K行数据可以处理成总和。
Sub Sum_and_Dedupe()
With Worksheets("sheet1")
'deal with the block of data radiating out from A1
With .Cells(1, 1).CurrentRegion
'step off the header and make one column wider
With .Resize(.Rows.Count - 1, .Columns.Count + 1).Offset(1, 0)
.Columns(.Columns.Count).Formula = "=sumifs(e:e, b:b, b2, f:f, f2, g:g, g2)"
.Columns(5) = .Columns(.Columns.Count).Value
.Columns(.Columns.Count).Delete
End With
'remove duplicates
.RemoveDuplicates Columns:=Array(2, 6, 7), Header:=xlYes
End With
.UsedRange
End With
End Sub
对于18K行的随机数据,这需要大约18秒。你自己的结果将根据硬件和软件而有所不同,但这应该是大概的。
答案 1 :(得分:1)
这里"棒球场" #2
Sub main()
Dim helperRng As Range
With Worksheets("Sheet01")
With .UsedRange
Set helperRng = .Offset(, .Columns.Count + 1).Resize(, 1)
With helperRng
.FormulaR1C1 = "=concatenate(RC2, RC6, RC7)"
.Offset(, 1).FormulaR1C1 = "=if(countif(R1C[-1]:RC[-1], RC[-1])=1,1,"""")"
With .Offset(, 2)
.FormulaR1C1 = "=sumif(C[-2], RC[-2],C5)"
.Value = .Value
End With
.Offset(, 1).SpecialCells(xlCellTypeFormulas, xlTextValues).EntireRow.Delete
Worksheets("Sheet01").Columns(5).Resize(.Rows.Count - 1).Offset(1).Value = .Offset(1, 2).Resize(.Rows.Count - 1).Value
helperRng.Resize(, 3).Clear
End With
End With
End With
End Sub
只有好奇哪个更快!
答案 2 :(得分:0)
可以使用数组和字典对象在18毫秒(略微夸张)中完成此操作。我知道求和的值在第4列中,从而简化了函数。您可以在其他列中为多个值调整代码。我正在从1个数组写入另一个数组(从InAy到OutAy),字典确定行是否已存在。魔术发生在字典的Item属性中。当写入新的OutAy行时,我将item属性值分配给行(r)。然后,当它已经存在时,我使用item属性值检索将其写入OutAy的行(r):d.item(KeyIn)然后可以用现有值的总和更新OutAy(r,c)中的该值和新值“ KeyVal”。
这解决了与SQL查询聚合相同的问题:“从数据组中按a,b,c选择a,b,c,sum(d)”
注意:添加一个工具->对Microsoft脚本运行时的引用
sub some()
...
data = Range("WhereYourDataIs") 'create data array
Range("WhereYourDataIs").clear 'assumes you'll output to same location
data = RemoveDupes(data) 'removedupes and sum values
Range("A2").Resize(UBound(data), UBound(data, 2)) = data 'A2 assumes your data headers begin in row 1, column 1
...
End Sub
Function RemoveDupes(InAy As Variant) As Variant
Dim d As Scripting.Dictionary
Set d = New Scripting.Dictionary
ReDim OutAy(1 To UBound(InAy), 1 To 4)
r = 1
For i = 1 To UBound(InAy)
KeyIn = ""
KeyVal = InAy(i, 4) 'the value field to sum/aggregate if exists
For c = 1 To 3 'a, b, c metadata to roll up
KeyIn = KeyIn & InAy(i, c)
Next c
If d.Exists(KeyIn) Then
OutAy(d.item(KeyIn), 4) = OutAy(d.item(KeyIn), 4) + KeyVal 'the summation of value field for existing row in OutAy
Else:
d.Add KeyIn, r 'r is set as the item value referencing the row of the OutAy when it was first added. The reference is used when .Exists is true
For c = 1 To 4
OutAy(r, c) = InAy(i, c)
Next c
r = r + 1
End If
Next
RemoveDupes = OutAy
End Function