合并具有连接日期的行

时间:2015-04-18 09:52:50

标签: excel matlab excel-vba excel-formula vba

我有一张包含客户和订阅数据的大型Excel表格。从这个表我想合并记录/行与连接stop_和start_dates并在新的工作表中显示结果。简化版数据如下所示。

Customer_id subscription_id   start_date    stop_date
1034        RV4               30-4-2012     30-1-2015
1035        AB7               30-1-2014     30-3-2014
1035        AB6               30-1-2014     30-3-2014
1035        AB7               30-12-2013    30-1-2014
1035        AB7               12-12-2012    30-12-2013
1035        AB7               12-9-2010     14-1-2011

因此,公式必须检查customer_id和subscription_id。如果工作表中的两行或多行之间存在匹配,并且其中一行的stop_date与另一行的start_date重叠,则在提取和合并之后,必须使用第一行的start_date显示一个新行。另一行的stop_date。如果有多个连接日期的行,这也必须工作。在提取之后,所有不符合这些标准的行保持不变。所以结果将是这样的:

Customer_id subscription_id start_date  stop_date
1034        RV4             30-4-2012   30-1-2015
1035        AB6             30-1-2014   30-3-2014
1035        AB7             12-12-2012  30-3-2014
1035        AB7             12-9-2010   14-1-2011

动态解决方案将是理想的,而新数据将添加到原始工作表中。虽然我知道当你确定你要寻找的行总是在彼此之下时,这是可能的,但这不是这里的情况,它不会给你一个非常动态的解决方案。

因此我想在Excel中需要某种类型的数组函数,但经过大量搜索后我无法找到合适的解决方案。我也有MATLAB可用,但不知道该程序从哪里开始出现这样的问题。

1 个答案:

答案 0 :(得分:0)

动态解决方案可能可能,但如果数据集很大,它可能会使事情陷入困境,因为每次更改单元格时都需要它运行。

基本上我能看到的最好方法是在customer_id和subscription_id中创建唯一的密钥,然后收集该密钥下的所有日期范围并合并它们。

这样的事情应该让你入门(需要引用Microsoft Scripting Runtime):

Public Sub LinkSubscriptionDates()

    Dim data As Dictionary, source As Worksheet, target As Worksheet

    Set source = ActiveSheet
    Set data = GetSubscriptions(source)
    Set target = source.Parent.Worksheets.Add

    'Copy headers
    target.Range(target.Cells(1, 1), target.Cells(1, 4)).Value = _
           source.Range(source.Cells(1, 1), source.Cells(1, 4)).Value

    Dim row As Long
    row = 2

    Dim key As Variant, item As Variant
    For Each key In data.Keys
        For Each item In data(key)
            target.Cells(row, 1) = Split(key, "|")(0)
            target.Cells(row, 2) = Split(key, "|")(1)
            target.Cells(row, 3) = Split(item, "|")(0)
            target.Cells(row, 4) = Split(item, "|")(1)
            row = row + 1
        Next item
    Next key

End Sub

Private Function GetSubscriptions(source As Worksheet) As Dictionary

    Dim subscrips As Dictionary
    Set subscrips = New Dictionary

    Dim row As Long
    Dim cust As String, subs As String, starting As String, ending As String

    'Gather all the data as pairs of customer|subscription, starting|ending
    For row = 2 To source.UsedRange.Rows.Count
        Dim dates() As String
        cust = source.Cells(row, 1).Value
        subs = source.Cells(row, 2).Value
        'Valid customer/subscription?
        If cust <> vbNullString And subs <> vbNullString Then
            starting = source.Cells(row, 3).Value
            ending = source.Cells(row, 4).Value
            'Has an ending and starting date?
            If starting <> vbNullString And ending <> vbNullString Then
                Dim key As String
                key = cust & "|" & subs
                'New combo?
                If Not subscrips.Exists(key) Then
                    subscrips.Add key, New Collection
                    subscrips(key).Add starting & "|" & ending
                Else
                    subscrips(key).Add starting & "|" & ending
                    Set subscrips(key) = MergeDates(subscrips(key))
                End If
            End If
        End If
    Next row

    Set GetSubscriptions = subscrips

End Function

Private Function MergeDates(dates As Collection) As Collection

    Dim candidate As Long, index As Long
    Dim values() As String, test() As String
    Dim merge As Boolean

    For index = 1 To dates.Count
        values = Split(dates(index), "|")
        'Check to see if it can be merged with any other row.
        For candidate = index + 1 To dates.Count
            test = Split(dates(candidate), "|")
            If CDate(test(0)) >= CDate(values(0)) And _
               CDate(test(0)) <= CDate(values(1)) Or _
               CDate(test(1)) >= CDate(values(0)) And _
               CDate(test(1)) <= CDate(values(1)) Then
                dates.Remove candidate
                merge = True
                Exit For
            End If
        Next candidate
        If merge Then Exit For
    Next index

    If merge Then
        'Pull both rows out of the collection.
        dates.Remove index
        values(0) = IIf(CDate(test(0)) < CDate(values(0)), _
                        CDate(test(0)), CDate(values(0)))
        values(1) = IIf(CDate(test(1)) > CDate(values(1)), _
                        CDate(test(1)), CDate(values(1)))
        'Put the merged date range back in.
        dates.Add values(0) & "|" & values(1)
        'Recurse.
        Set MergeDates = MergeDates(dates)
    End If

    Set MergeDates = dates

End Function

它确实需要通过数据验证,错误捕获等来充实,并且它当前只是将结果数据放在新工作表上。所有工作都在GetSubscriptions函数中完成,因此您只需从中获取返回的Dictionary,并对其中的数据执行任何操作。