将行与重复值组合,合并单元格(如果不同

时间:2015-05-24 07:15:37

标签: excel vba excel-vba

我有类似的问题 [将行与重复值组合] [1] Excel VBA - Combine rows with duplicate values in one cell and merge values in other cell

我有这种格式的数据(行已排序)

Pub     ID      CH      Ref
no      15      1      t2
no      15      1      t88
yes     15      2      t3
yes     15      2      t3
yes     15      2      t6

比较相邻的行(比如第4行和第5行),如果第2列和第3列匹配,那么如果第4列不同,则合并col4,删除行。如果col 2,3,4匹配然后删除行,则不要合并col 4

期望输出

key     ID      CH      Text  
no      15      1       t2   t88
yes     15      2       t3   t6

此第一个代码部分无法正常工作

Sub mergeCategoryValues()
    Dim lngRow As Long

    With ActiveSheet
        Dim columnToMatch1 As Integer: columnToMatch1 = 2
        Dim columnToMatch2 As Integer: columnToMatch2 = 3
        Dim columnToConcatenate As Integer: columnToConcatenate = 4


        lngRow = .Cells(65536, columnToMatch1).End(xlUp).row
        .Cells(columnToMatch1).CurrentRegion.Sort key1:=.Cells(columnToMatch1), Header:=xlYes
        .Cells(columnToMatch2).CurrentRegion.Sort key1:=.Cells(columnToMatch2), Header:=xlYes

        Do
            If .Cells(lngRow, columnToMatch1) = .Cells(lngRow - 1, columnToMatch1) Then 'check col 2 row lngRow, lngRow-1
              If .Cells(lngRow, columnToMatch2) = .Cells(lngRow - 1, columnToMatch2) Then 'check col 3 row lngRow, lngRow-1
                 If .Cells(lngRow - 1, columnToConcatenate) = .Cells(lngRow, columnToConcatenate) Then
                    Else
                    .Cells(lngRow - 1, columnToConcatenate) = .Cells(lngRow - 1, columnToConcatenate) & "; " & .Cells(lngRow, columnToConcatenate)
                 End If
                .Rows(lngRow).Delete
              End If
            End If
            lngRow = lngRow - 1
        Loop Until lngRow = 1
    End With

实际输出不正确,因为当单元格合并时t3将不匹配t3; t6,我对col 4的比较仅适用于非常简单的情况。

实际输出

key ID  CH  Text
no  15  1   t2; t88
yes 15  2   t3; t3; t6

因此,我必须添加这两个部分来拆分Concatenate单元格,然后删除重复项

'split cell in Col d to col e+ delimited by ;
        With Range("D2:D6", Range("D" & Rows.Count).End(xlUp))
            .Replace ";", " ", xlPart
            .TextToColumns other:=True
        End With

 'remove duplicates in each row

    Dim x, y(), i&, j&, k&, s$
    With ActiveSheet.UsedRange
        x = .Value: ReDim y(1 To UBound(x, 1), 1 To UBound(x, 2))
        For i = 1 To UBound(x)
            For j = 1 To UBound(x, 2)
                If Len(x(i, j)) Then
                    If InStr(s & "|", "|" & x(i, j) & "|") = 0 Then _
                       s = s & "|" & x(i, j): k = k + 1: y(i, k) = x(i, j)
                End If
            Next j: s = vbNullString: k = 0
        Next i
        .Value = y()
    End With
    End Sub

附加代码输出

Pub ID  CH  Ref 
no  15  1   t2  t88
yes 15  2   t3  t6

问题:使用三种不同的方法必须有更简单的方法吗?如果col 4项目不匹配,如何插入新列5+?

注意:从excelforum的用户nilem中找到了删除重复代码。

编辑:如果第2列和第3列匹配,则第1列将始终相同。如果解决方案更容易,我们可以假设Col 1为空并忽略数据。

我有打印的书籍查找表,需要转换成一种简单的格式,用于使用1960年语言的设备,该语言的命令非常有限。我正在尝试预先格式化这些数据,所以我只需要搜索包含所有信息的一行。

Col D最终输出可以在带分隔符的col D或col D-K(仅8最大Ref)中,因为我将解析为在其他机器上使用。无论什么方法都比较容易。

3 个答案:

答案 0 :(得分:1)

删除行的规范做法是从底部开始,向顶部工作。以这种方式,不跳过行。这里的技巧是在当前位置上方找到与列B和C匹配的行,并在删除行之前连接D列中的字符串。有几个好的工作表公式可以获取两列匹配的行号。将其中一个用application.Evaluate付诸实践似乎是从D列收集值的最有效的方法。

Sub dedupe_and_collect()
    Dim rw As Long, mr As Long, wsn As String

    With ActiveSheet   '<- set this worksheet reference properly!
        wsn = .Name
        With .Cells(1, 1).CurrentRegion
            .RemoveDuplicates Columns:=Array(2, 3, 4), Header:=xlYes
        End With
        With .Cells(1, 1).CurrentRegion  'redefinition after duplicate removal
            For rw = .Rows.Count To 2 Step -1 'walk backwards when deleting rows
                If Application.CountIfs(.Columns(2), .Cells(rw, 2).Value, .Columns(3), .Cells(rw, 3).Value) > 1 Then
                    mr = Application.Evaluate("MIN(INDEX(ROW(1:" & rw & ")+(('" & wsn & "'!B1:B" & rw & "<>'" & wsn & "'!B" & rw & ")+('" & wsn & "'!C1:C" & rw & "<>'" & wsn & "'!C" & rw & "))*1E+99, , ))")
                    'concatenate column D
                    '.Cells(mr, 4) = .Cells(mr, 4).Value & "; " & .Cells(rw, 4).Value
                    'next free column from column D
                    .Cells(mr, Columns.Count).End(xlToLeft).Offset(0, 1) = .Cells(rw, 4).Value
                    .Rows(rw).EntireRow.Delete
                End If
            Next rw
        End With
    End With
End Sub

使用VBA等效于日期►数据工具►删除重复项命令,删除三列匹配的记录。这仅考虑列B,C和D并删除较低的重复项(保持最接近行1的那些)。如果A列在这方面很重要,则必须增加额外的编码。

我不清楚您是想将D列作为分隔字符串还是将单独的单元格作为最终结果。你能澄清一下吗?

答案 1 :(得分:1)

正如我上面所写,我将遍历数据并将内容收集到用户定义的对象中。不需要在此方法中对数据进行排序;并且将省略重复REF&#39; s。

用户定义对象的一个​​优点是,它可以使调试更容易,因为您可以更清楚地看到您所做的事情。

我们通过使用Collection对象的属性来组合IDCH相同的每一行,如果使用相同的键,则会引发错误。

只要将单个单元格中的Ref与分隔符组合,与D:K列中的单个单元格相结合,可以简单地完成。我选择分成列,但将其更改为合并为一列将是微不足道的。

插入课程模块后,您必须重命名:cID_CH

您会注意到我将结果放在单独的工作表上。你可以覆盖原始数据,但我会建议不要这样做。

班级单元

Option Explicit
Private pID As Long
Private pCH As Long
Private pPUB As String
Private pREF As String
Private pcolREF As Collection

Public Property Get ID() As Long
    ID = pID
End Property
Public Property Let ID(Value As Long)
    pID = Value
End Property

Public Property Get CH() As Long
    CH = pCH
End Property
Public Property Let CH(Value As Long)
    pCH = Value
End Property

Public Property Get PUB() As String
    PUB = pPUB
End Property
Public Property Let PUB(Value As String)
    pPUB = Value
End Property

Public Property Get REF() As String
    REF = pREF
End Property
Public Property Let REF(Value As String)
    pREF = Value
End Property

Public Property Get colREF() As Collection
    Set colREF = pcolREF
End Property

Public Sub ADD(refVAL As String)
    On Error Resume Next
        pcolREF.ADD refVAL, refVAL
    On Error GoTo 0
End Sub

Private Sub Class_Initialize()
    Set pcolREF = New Collection
End Sub

常规模块

Option Explicit
Sub CombineDUPS()
    Dim wsSRC As Worksheet, wsRES As Worksheet
    Dim vSRC As Variant, vRES() As Variant, rRES As Range
    Dim cI As cID_CH, colI As Collection
    Dim I As Long, J As Long
    Dim S As String

'Set source and results worksheets and results range
Set wsSRC = Worksheets("sheet1")
Set wsRES = Worksheets("sheet2")
Set rRES = wsRES.Cells(1, 1)

'Get Source data
With wsSRC
    vSRC = .Range("A2", .Cells(.Rows.Count, "D").End(xlUp))
End With

'Collect and combine data
Set colI = New Collection
On Error Resume Next
For I = 1 To UBound(vSRC, 1)
    Set cI = New cID_CH
    With cI
        .PUB = vSRC(I, 1)
        .ID = vSRC(I, 2)
        .CH = vSRC(I, 3)
        .REF = vSRC(I, 4)
        .ADD .REF
        S = CStr(.ID & "|" & .CH)
        colI.ADD cI, S
        If Err.Number = 457 Then
            Err.Clear
            colI(S).ADD .REF
        ElseIf Err.Number <> 0 Then
            Debug.Print Err.Number, Err.Description
            Stop
        End If
    End With
Next I
On Error GoTo 0

'Create and populate Results Array
ReDim vRES(0 To colI.Count, 1 To 11)

'Header row
vRES(0, 1) = "Pub"
vRES(0, 2) = "ID"
vRES(0, 3) = "CH"
vRES(0, 4) = "Ref"

'populate array
For I = 1 To colI.Count
    With colI(I)
        vRES(I, 1) = .PUB
        vRES(I, 2) = .ID
        vRES(I, 3) = .CH
        For J = 1 To .colREF.Count
            vRES(I, J + 3) = .colREF(J)
        Next J
    End With
Next I

'Write the results to the worksheet
Set rRES = rRES.Resize(UBound(vRES, 1) + 1, UBound(vRES, 2))
With rRES
    .EntireColumn.Clear
    .Value = vRES
    With .Rows(1)
        .Font.Bold = True
        .HorizontalAlignment = xlCenter
        Range(.Cells(4), .Cells(11)).HorizontalAlignment = xlCenterAcrossSelection
    End With
    .EntireColumn.AutoFit
End With

End Sub

<强>原始

Original Data

已处理结果

Results

答案 2 :(得分:1)

变体使用下面的字典

Sub test()
    Dim Dic As Object: Set Dic = CreateObject("Scripting.Dictionary")
    Dic.Comparemode = vbTextCompare
    Dim Cl As Range, x$, y$, i&, Key As Variant
    For Each Cl In Range("B2:B" & Cells(Rows.Count, "B").End(xlUp).Row)
        x = Cl.Value & "|" & Cl.Offset(, 1).Value
        y = Cl.Offset(, 2).Value
        If Not Dic.exists(x) Then
            Dic.Add x, Cl.Offset(, -1).Value & "|" & y & "|"
        ElseIf Dic.exists(x) And Not LCase(Dic(x)) Like "*|" & LCase(y) & "|*" Then
            Dic(x) = Dic(x) & "|" & y & "|"
        End If
    Next Cl
    Range("A2:D" & Cells(Rows.Count, "B").End(xlUp).Row).ClearContents
    i = 2
    For Each Key In Dic
        Cells(i, "A") = Split(Dic(Key), "|")(0)
        Range(Cells(i, "B"), Cells(i, "C")) = Split(Key, "|")
        Cells(i, "D") = Replace(Split(Replace(Dic(Key), "||", ";"), "|")(1), ":", ";")
        i = i + 1
    Next Key
    Set Dic = Nothing
End Sub

<强>前

enter image description here

<强>后

enter image description here