Excel VBA splititng data into colums without duplicating

时间:2017-04-10 02:57:40

标签: excel vba excel-vba

I'm trying to create a macro that goes through a huge list of data and that would split into columns parts of the data according to some criteria. Please not that there is no pattern that can be hard coded since in the example below the number of dependants can change drastically.

As of now, my data looks like this. enter image description here

I would want it like this enter image description here

I've written some code that goes through everything, and write each dependent on a new column but that create duplicates. I'm not able to detect whether that value was already on the row or not. Here is the code that I tried to write. Take not that we're already in the context of looping through every row.

dim isUnique 
            For i = 1 To 100
                If  not WsT.Cells(Rt, i).Value =  .Cells(NbcDepfname).Value Then 
                    isUnique = true
                else 
                    isUnique = false
                End If
            Next i
              Ct = 10

              If isUnique Then
                  WsT.Cells(Rt, Ct).Value = .Cells(NbcDepLname).Value
                  WsT.Cells(Rt, Ct + 1).Value = .Cells(NbcDepfname).Value
                  WsT.Cells(Rt, Ct + 2).Value = .Cells(NbcDepBDate).Value

            End If
        End With

I have very little experience with VBA and macros so my approach might not be the best or good at all. I've noticed also huge performance drops with that approach but it's alright for that project.

**EDIT ****** I understand my mistake perfectly. I'm looping through "EVERY" row of the first sheet with the unformatted data. I don't check correctly if the data is unique for that row before writing it, therefore, It will obviously keep writing a dependant per row. I tried with dictionaries but since it is possible that a dependant has both the same name and birthdate as another one on another row with different parents i couldn't keep that solution. The detection has to be done at the Row level.

2 个答案:

答案 0 :(得分:0)

我突然想到你可能有两个同姓的家属,配偶和孩子。因此,我在之前的解决方案中添加了另一个测试以检查DoB。这是扩展版本。

Private Function IsUnique(ByVal DepName As String, _
                          DepDob As String, _
                          Ws As Worksheet, _
                          R As Long) As Boolean
    ' 11 Apr 2017

    Dim Rng As Range
    Dim Dob As Variant
    Dim C As Long

    With Ws.Rows(R)
        Set Rng = Range(.Cells(NedDependt), .Cells(Ws.UsedRange.Columns.Count))
    End With
    DepName = Trim(DepName)

    With Rng
        For C = (.Cells.Count - 1) To 1 Step ((NedDepDob + 1) * -1)
            If StrComp(Trim(.Cells(C).Value), DepName, vbTextCompare) = 0 Then
                Dob = .Cells(C + NedDepDob).Value
                If IsDate(Dob) Then
                    If IsDate(DepDob) Then
                        If CDate(Dob) = CDate(DepDob) Then Exit For
                    Else
                        If Trim(Dob) = Trim(DepDob) Then Exit For
                    End If
                Else
                    If Trim(Dob) = Trim(DepDob) Then Exit For
                End If
            End If
        Next C
    End With

    IsUnique = (C < 1)
End Function

为了这个过程的目的,我创建了一个新的枚举,它取代了之前的Enum Nfd。在您上面发布它们时,新枚举将调整为实际列。如果您在模块以外的模块中引用它,则可能需要它不是Private。

Private Enum Ned                        ' worksheet Employee Data
    NedFirstDataRow = 2                 ' Adjust as required
    NedName = 1                         ' columns:
    NedFname
    NedEmployed
    NedDependt
    NedDepName = 0                       ' Offset from NedDependt
    NedDepDob
End Enum

以下是我用于测试的调用程序。将函数集成到项目中后,工作表应该是包含格式化数据的工作表,而名称,Dob和行是变量。

Private Sub TestIsUnique()
    Debug.Print IsUnique("vanessa", "1/12/1976", ActiveSheet, 9)
    Debug.Print IsUnique("vanessa", "01/02/1976", ActiveSheet, 9)
End Sub

日期有点问题。我注意到你使用yyyy / mm / ddd。代码没有问题,但如果数据中的日期是字符串,则可能会出现问题。我对其进行了编程,以便代码首先尝试比较实际日期,但如果其中一个值无法转换为日期,则会比较字符串。

答案 1 :(得分:0)

我在previous, very similar thread发布的例程的细微更改就是所需要的。该例程已经使用字典来检查重复项,所以这只是在这个线程中适应你稍微不同的布局的问题。

值得注意的是,显然,这需要通过FirstNameLastName + EmploymentDate的组合来识别“主要”人物。但是,它不要求对列表进行排序。

它应该适应任意数量的唯一Dependent Name / BirthDate对。

在编写结果之前,它将清除整个结果工作表。

请务必阅读不同模块中的注释。它们对于运行至关重要。您将使用两个类模块和一个常规模块

1级

Option Explicit
'Rename this module: cDependents
'set reference to Microsoft Scripting Runtime
Private pBirthDt As Date
Private pDepName As String

Public Property Get BirthDt() As Date
    BirthDt = pBirthDt
End Property
Public Property Let BirthDt(Value As Date)
    pBirthDt = Value
End Property

Public Property Get DepName() As String
    DepName = pDepName
End Property
Public Property Let DepName(Value As String)
    pDepName = Value
End Property

2级

Option Explicit
'rename this module: cFamily
'set reference to Microsoft Scripting Runtime
Private pFirstName As String
Private pLastName As String
Private pBirthDate As Date
Private pEmploymentDate As Date
Private pDependents As Dictionary

Public Property Get FirstName() As String
    FirstName = pFirstName
End Property
Public Property Let FirstName(Value As String)
    pFirstName = Value
End Property

Public Property Get LastName() As String
    LastName = pLastName
End Property
Public Property Let LastName(Value As String)
    pLastName = Value
End Property

Public Property Get BirthDate() As Date
    BirthDate = pBirthDate
End Property
Public Property Let BirthDate(Value As Date)
    pBirthDate = Value
End Property

Public Property Get EmploymentDate() As Date
    EmploymentDate = pEmploymentDate
End Property
Public Property Let EmploymentDate(Value As Date)
    pEmploymentDate = Value
End Property

Public Function ADDDependents(BrthDt, Name)
    Dim cD As New cDependents
    Dim sKey As String
    With cD
        .DepName = Name
        .BirthDt = BrthDt
        sKey = .BirthDt & Chr(1) & .DepName
    End With

    If Not pDependents.Exists(sKey) Then
        pDependents.Add Key:=sKey, Item:=cD
    End If
End Function

Public Property Get Dependents() As Dictionary
    Set Dependents = pDependents
End Property


Private Sub Class_Initialize()
    Set pDependents = New Dictionary
End Sub

常规模块

Option Explicit
'set reference to Microsoft Scripting Runtime
Sub Family()
    Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
    Dim vSrc As Variant, vRes As Variant
    Dim dF As Dictionary, cF As cFamily
    Dim I As Long, J As Long
    Dim sKey As String
    Dim V As Variant, W As Variant

'Set source and results worksheets and results range
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet2")
    Set rRes = wsRes.Cells(1, 1)

'read source data into array
With wsSrc
    vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=5)
End With

'Collect and organize the family and dependent objects
Set dF = New Dictionary
For I = 2 To UBound(vSrc, 1)
    Set cF = New cFamily
    With cF
        .LastName = vSrc(I, 1)
        .FirstName = vSrc(I, 2)
        .EmploymentDate = vSrc(I, 3)
        .ADDDependents vSrc(I, 5), vSrc(I, 4)
        sKey = .LastName & .FirstName & .EmploymentDate
        If Not dF.Exists(sKey) Then
            dF.Add Key:=sKey, Item:=cF
        Else
            dF(sKey).ADDDependents vSrc(I, 5), vSrc(I, 4)
        End If
    End With
Next I

'Results will have two columns for each relation
' + three columns at the beginning

'get number of extra columns
    Dim ColCount As Long
    For Each V In dF
        I = dF(V).Dependents.Count
        ColCount = IIf(I > ColCount, I, ColCount)
    Next V
    ColCount = ColCount * 2 + 3


ReDim vRes(0 To dF.Count, 1 To ColCount)
    vRes(0, 1) = "Last Name"
    vRes(0, 2) = "First Name"
    vRes(0, 3) = "Employment Date"

    For J = 4 To UBound(vRes, 2) Step 2
        vRes(0, J) = "Dependent Name "
        vRes(0, J + 1) = "Dependent BirthDate"
    Next J


I = 0
For Each V In dF
    I = I + 1
    With dF(V)
        vRes(I, 1) = .LastName
        vRes(I, 2) = .FirstName
        vRes(I, 3) = .EmploymentDate

        J = 2
        For Each W In .Dependents
            J = J + 2
            With .Dependents(W)
                vRes(I, J) = .DepName
                vRes(I, J + 1) = .BirthDt
            End With
        Next W
    End With
Next V

Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1) + 1, columnsize:=UBound(vRes, 2))
With rRes
    .Worksheet.Cells.Clear
    .Value = vRes
    With .Rows(1)
        .Font.Bold = True
        .Font.Color = vbWhite
        .Interior.Color = vbBlue
        .HorizontalAlignment = xlCenter
    End With
    For I = 3 To .Columns.Count - 1 Step 2
        .Columns(I).NumberFormat = "yyyy-mm-dd"
    Next I
    .EntireColumn.AutoFit
End With

End Sub

运行上述数据时的结果

enter image description here