使用包含超过3000列的Excel文件,并且存在多个重复列标题的问题。当这些数据被送入另一个系统时,它就会出错。寻找合并电子表格中的列的方法。相同的标题可以出现在两到六列中,但每行数据只会填充一列。
我已经看到一个合并重复项列标题的帖子彼此相邻,我可以这样做(就像我在示例数据中那样),但该帖子只合并了标题数据。
不确定如何附加样本数据,但希望人们可以看到:
1350725 1350725 1350740 1350813 1351468 1351468
B A C
A C B E
C D C E
A C C D
B E B
答案 0 :(得分:2)
您可以使用电源查询轻松完成此操作。它是Excel 2010+的加载项(默认情况下在Excel 2016中称为Get& Transform)。在那里,您可以直接将Excel与任何数据源连接,然后在查询编辑器中转换数据。对于您的情况,请按照以下步骤操作:
答案 1 :(得分:2)
希望这应该有效。 我使用存储数组的字典来重复删除列。 请注意,您必须在注释中设置引用(或进行一些小的更改以使用后期绑定)。此外,您还需要更改源和结果工作表名称以与数据保持一致。
此外,假设源数据表是此工作表上的唯一内容,它从A1
开始。 LastRowCol
函数检测数据的结束点。
如果您的源数据表不符合这些要求,则需要进行更改以检测正确的数据区域。
'Set reference to Microsoft Scripting Runtime
Option Explicit
Sub CombineColumns()
Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
Dim vSrc As Variant, vRes As Variant
Dim dD As Dictionary
Dim I As Long, J As Long
Dim lLastRowCol() As Long
Dim V() As Variant
'set Source and REsults worksheets, ranges
Set wsSrc = Worksheets("sheet5")
Set wsRes = Worksheets("sheet6")
Set rRes = wsRes.Cells(1, 1)
'Get source data into vba array
With wsSrc
lLastRowCol = LastRowCol(wsSrc.Name)
vSrc = .Range(.Cells(1, 1), .Cells(lLastRowCol(0), lLastRowCol(1)))
End With
'Collect and merge the data
Set dD = New Dictionary
ReDim V(2 To UBound(vSrc, 1))
For J = 1 To UBound(vSrc, 2)
If Not dD.Exists(vSrc(1, J)) Then 'set new dictionary item
For I = 2 To UBound(vSrc, 1)
V(I) = vSrc(I, J)
Next I
dD.Add Key:=vSrc(1, J), Item:=V
Else 'combine the columns
For I = 2 To UBound(vSrc, 1)
If vSrc(I, J) <> "" Then
V = dD(vSrc(1, J))
V(I) = vSrc(I, J)
dD(vSrc(1, J)) = V
End If
Next I
End If
Next J
'Write results to output array
ReDim vRes(0 To UBound(vSrc, 1) - 1, 1 To dD.Count)
'Headers
J = 0
Dim V1 As Variant
For Each V1 In dD.Keys
J = J + 1
vRes(0, J) = V1
Next V1
'Data
For J = 1 To UBound(vRes, 2)
I = 0
For Each V1 In dD(vRes(0, J))
I = I + 1
vRes(I, J) = V1
Next V1
Next J
'write the results
Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
With rRes
.EntireColumn.Clear
.Value = vRes
With .Rows(1)
.Font.Bold = True
.HorizontalAlignment = xlCenter
End With
.EntireColumn.AutoFit
End With
End Sub
Private Function LastRowCol(Worksht As String) As Long()
Dim WS As Worksheet, R As Range
Dim LastRow As Long, LastCol As Long
Dim L(1) As Long
Set WS = Worksheets(Worksht)
With WS
Set R = .Cells.Find(what:="*", after:=.Cells(1, 1), _
LookIn:=xlValues, searchorder:=xlByRows, _
searchdirection:=xlPrevious)
If Not R Is Nothing Then
LastRow = R.Row
LastCol = .Cells.Find(what:="*", after:=.Cells(1, 1), _
LookIn:=xlValues, searchorder:=xlByColumns, _
searchdirection:=xlPrevious).Column
Else
LastRow = 1
LastCol = 1
End If
End With
L(0) = LastRow
L(1) = LastCol
LastRowCol = L
End Function
原始数据
<强>联合强>
答案 2 :(得分:0)
PowerQuery是迄今为止最好的工具,因为你可以在几分钟内完成一个解决方案,而不会消耗太多的脑力。
但是为了完整性,这里有一个VBA解决方案可以做你想要的,并且还处理两个以上的重复列。它假定这些列始终位于旁边,与样本数据一样。
这需要我在30分钟到60分钟之间进行整理和排除故障,因为我试图优化和处理您删除的列需要一些思考。相比之下,在PQ中组合解决方案可能只需要几分钟。这就是为什么我投票赞成@virtualdvid采用的方法。在效率和稳健性方面,我的方法不像下面的Rick's Dictionary方法那样快速或强大。与PQ或词典相比,这匹马将遥遥领先。
Sub Test()
Dim lLastRow As Long
Dim lLastCol As Long
Dim i As Long
Dim j As Long
Dim k As Long
Dim sHeader1 As String
Dim sHeader2 As String
lLastCol = ActiveSheet.Cells(1, ActiveSheet.Columns.Count).End(xlToLeft).Column
For i = lLastCol To 1 Step -1
sHeader1 = Cells(1, i)
For j = i - 1 To 1 Step -1
sHeader2 = Cells(1, j)
If sHeader2 <> sHeader1 Then Exit For
If sHeader1 = sHeader2 Then
lLastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, i).End(xlUp).Row
For k = 2 To lLastRow
If Cells(k, i).Value <> "" Then
Cells(k, j).Value = Cells(k, i).Value
End If
Next k
Columns(i).Delete Shift:=xlToLeft
End If
Next j
Next i
End Sub
请注意,这不是最佳VBA。您可以通过关闭screenupdating进一步优化此功能。更好的是,不要这么做,而是将所有数据一次性地拉入变体数组中,然后使用类似的代码进行合并,然后一次性将其转储回工作表中。甚至比这更好的是类似于里克斯的词典方法。
答案 3 :(得分:-1)
这可能对您有用:
Sub Test()
Dim lastcol As Long, lastrow As Long, lastrow2 As Long, i As Long, j As Long, k As Long
lastcol = ActiveSheet.Cells(1, ActiveSheet.Columns.Count).End(xlToLeft).Column
For i = 1 To lastcol
For j = i To lastcol
If Cells(1, i).Value = Cells(1, j).Value And i <> j Then 'Merge em
lastrow = ActiveSheet.Cells(ActiveSheet.Rows.Count, i).End(xlUp).Row
lastrow2 = ActiveSheet.Cells(ActiveSheet.Rows.Count, j).End(xlUp).Row
If lastrow < lastrow2 Then
lastrow = lastrow2
End If
For k = 2 To lastrow
If Cells(k, j).Value <> "" Then
Cells(k, i).Value = Cells(k, j).Value
End If
Next k
Columns(j).Delete Shift:=xlToLeft
Exit For
End If
Next j
Next i
End Sub
测试数据:
关于您的测试数据:
不确定为什么我的照片没有通过......对不起。编辑:看起来现在正在工作。