在Excel中合并数据

时间:2018-03-28 18:28:48

标签: excel excel-vba vba

使用包含超过3000列的Excel文件,并且存在多个重复列标题的问题。当这些数据被送入另一个系统时,它就会出错。寻找合并电子表格中的列的方法。相同的标题可以出现在两到六列中,但每行数据只会填充一列。

我已经看到一个合并重复项列标题的帖子彼此相邻,我可以这样做(就像我在示例数据中那样),但该帖子只合并了标题数据。

不确定如何附加样本数据,但希望人们可以看到:

1350725  1350725  1350740  1350813  1351468 1351468
B                          A        C
         A        C        B                E
C                 D        C        E  
A                 C        C        D
B                          E                B

4 个答案:

答案 0 :(得分:2)

您可以使用电源查询轻松完成此操作。它是Excel 2010+的加载项(默认情况下在Excel 2016中称为Get& Transform)。在那里,您可以直接将Excel与任何数据源连接,然后在查询编辑器中转换数据。对于您的情况,请按照以下步骤操作:

答案 1 :(得分:2)

希望这应该有效。 我使用存储数组的字典来重复删除列。 请注意,您必须在注释中设置引用(或进行一些小的更改以使用后期绑定)。此外,您还需要更改源和结果工作表名称以与数据保持一致。

此外,假设源数据表是此工作表上的唯一内容,它从A1开始。 LastRowCol函数检测数据的结束点。

如果您的源数据表不符合这些要求,则需要进行更改以检测正确的数据区域。

'Set reference to Microsoft Scripting Runtime
Option Explicit
Sub CombineColumns()
    Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
    Dim vSrc As Variant, vRes As Variant
    Dim dD As Dictionary
    Dim I As Long, J As Long
    Dim lLastRowCol() As Long
    Dim V() As Variant

'set Source and REsults worksheets, ranges
 Set wsSrc = Worksheets("sheet5")
 Set wsRes = Worksheets("sheet6")
    Set rRes = wsRes.Cells(1, 1)

'Get source data into vba array
With wsSrc
    lLastRowCol = LastRowCol(wsSrc.Name)
    vSrc = .Range(.Cells(1, 1), .Cells(lLastRowCol(0), lLastRowCol(1)))
End With

'Collect and merge the data
Set dD = New Dictionary
ReDim V(2 To UBound(vSrc, 1))
For J = 1 To UBound(vSrc, 2)
    If Not dD.Exists(vSrc(1, J)) Then 'set new dictionary item
        For I = 2 To UBound(vSrc, 1)
            V(I) = vSrc(I, J)
        Next I
        dD.Add Key:=vSrc(1, J), Item:=V
    Else 'combine the columns
        For I = 2 To UBound(vSrc, 1)
            If vSrc(I, J) <> "" Then
                V = dD(vSrc(1, J))
                V(I) = vSrc(I, J)
                dD(vSrc(1, J)) = V
            End If
        Next I
    End If
Next J

'Write results to output array
ReDim vRes(0 To UBound(vSrc, 1) - 1, 1 To dD.Count)

'Headers
J = 0
Dim V1 As Variant
For Each V1 In dD.Keys
    J = J + 1
    vRes(0, J) = V1
Next V1

'Data
For J = 1 To UBound(vRes, 2)
    I = 0
    For Each V1 In dD(vRes(0, J))
        I = I + 1
        vRes(I, J) = V1
    Next V1
Next J

'write the results
Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
With rRes
    .EntireColumn.Clear
    .Value = vRes
    With .Rows(1)
        .Font.Bold = True
        .HorizontalAlignment = xlCenter
    End With
    .EntireColumn.AutoFit
End With

End Sub

Private Function LastRowCol(Worksht As String) As Long()
    Dim WS As Worksheet, R As Range
    Dim LastRow As Long, LastCol As Long
    Dim L(1) As Long
Set WS = Worksheets(Worksht)
With WS
    Set R = .Cells.Find(what:="*", after:=.Cells(1, 1), _
                    LookIn:=xlValues, searchorder:=xlByRows, _
                    searchdirection:=xlPrevious)

    If Not R Is Nothing Then
        LastRow = R.Row
        LastCol = .Cells.Find(what:="*", after:=.Cells(1, 1), _
                    LookIn:=xlValues, searchorder:=xlByColumns, _
                    searchdirection:=xlPrevious).Column
    Else
        LastRow = 1
        LastCol = 1
    End If
End With

L(0) = LastRow
L(1) = LastCol
LastRowCol = L
End Function

原始数据

enter image description here

<强>联合

enter image description here

答案 2 :(得分:0)

PowerQuery是迄今为止最好的工具,因为你可以在几分钟内完成一个解决方案,而不会消耗太多的脑力。

但是为了完整性,这里有一个VBA解决方案可以做你想要的,并且还处理两个以上的重复列。它假定这些列始终位于旁边,与样本数据一样。

这需要我在30分钟到60分钟之间进行整理和排除故障,因为我试图优化和处理您删除的列需要一些思考。相比之下,在PQ中组合解决方案可能只需要几分钟。这就是为什么我投票赞成@virtualdvid采用的方法。在效率和稳健性方面,我的方法不像下面的Rick's Dictionary方法那样快速或强大。与PQ或词典相比,这匹马将遥遥领先。

Sub Test()

Dim lLastRow As Long
Dim lLastCol As Long
Dim i As Long
Dim j As Long
Dim k As Long
Dim sHeader1 As String
Dim sHeader2 As String

lLastCol = ActiveSheet.Cells(1, ActiveSheet.Columns.Count).End(xlToLeft).Column

For i = lLastCol To 1 Step -1
    sHeader1 = Cells(1, i)
    For j = i - 1 To 1 Step -1
        sHeader2 = Cells(1, j)
        If sHeader2 <> sHeader1 Then Exit For
        If sHeader1 = sHeader2 Then
            lLastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, i).End(xlUp).Row
            For k = 2 To lLastRow
                If Cells(k, i).Value <> "" Then
                    Cells(k, j).Value = Cells(k, i).Value
                End If
            Next k
            Columns(i).Delete Shift:=xlToLeft
        End If
    Next j
Next i

End Sub

请注意,这不是最佳VBA。您可以通过关闭screenupdating进一步优化此功能。更好的是,不要这么做,而是将所有数据一次性地拉入变体数组中,然后使用类似的代码进行合并,然后一次性将其转储回工作表中。甚至比这更好的是类似于里克斯的词典方法。

答案 3 :(得分:-1)

这可能对您有用:

Sub Test()

Dim lastcol As Long, lastrow As Long, lastrow2 As Long, i As Long, j As Long, k As Long

lastcol = ActiveSheet.Cells(1, ActiveSheet.Columns.Count).End(xlToLeft).Column

For i = 1 To lastcol
    For j = i To lastcol
        If Cells(1, i).Value = Cells(1, j).Value And i <> j Then 'Merge em
            lastrow = ActiveSheet.Cells(ActiveSheet.Rows.Count, i).End(xlUp).Row
            lastrow2 = ActiveSheet.Cells(ActiveSheet.Rows.Count, j).End(xlUp).Row

            If lastrow < lastrow2 Then
                lastrow = lastrow2
            End If

                For k = 2 To lastrow
                    If Cells(k, j).Value <> "" Then
                        Cells(k, i).Value = Cells(k, j).Value
                    End If
                Next k

                Columns(j).Delete Shift:=xlToLeft
                Exit For
        End If
    Next j
Next i

End Sub

测试数据:

TestData

关于您的测试数据:

TestData2

不确定为什么我的照片没有通过......对不起。编辑:看起来现在正在工作。