我有一张包含客户和订阅数据的大型Excel表格。从这个表我想合并记录/行与连接stop_和start_dates并在新的工作表中显示结果。简化版数据如下所示。
Customer_id subscription_id start_date stop_date
1034 RV4 30-4-2012 30-1-2015
1035 AB7 30-1-2014 30-3-2014
1035 AB6 30-1-2014 30-3-2014
1035 AB7 30-12-2013 30-1-2014
1035 AB7 12-12-2012 30-12-2013
1035 AB7 12-9-2010 14-1-2011
因此,公式必须检查customer_id和subscription_id。如果工作表中的两行或多行之间存在匹配,并且其中一行的stop_date与另一行的start_date重叠,则在提取和合并之后,必须使用第一行的start_date显示一个新行。另一行的stop_date。如果有多个连接日期的行,这也必须工作。在提取之后,所有不符合这些标准的行保持不变。所以结果将是这样的:
Customer_id subscription_id start_date stop_date
1034 RV4 30-4-2012 30-1-2015
1035 AB6 30-1-2014 30-3-2014
1035 AB7 12-12-2012 30-3-2014
1035 AB7 12-9-2010 14-1-2011
动态解决方案将是理想的,而新数据将添加到原始工作表中。虽然我知道当你确定你要寻找的行总是在彼此之下时,这是可能的,但这不是这里的情况,它不会给你一个非常动态的解决方案。
因此我想在Excel中需要某种类型的数组函数,但经过大量搜索后我无法找到合适的解决方案。我也有MATLAB可用,但不知道该程序从哪里开始出现这样的问题。
答案 0 :(得分:0)
动态解决方案可能可能,但如果数据集很大,它可能会使事情陷入困境,因为每次更改单元格时都需要它运行。
基本上我能看到的最好方法是在customer_id和subscription_id中创建唯一的密钥,然后收集该密钥下的所有日期范围并合并它们。
这样的事情应该让你入门(需要引用Microsoft Scripting Runtime):
Public Sub LinkSubscriptionDates()
Dim data As Dictionary, source As Worksheet, target As Worksheet
Set source = ActiveSheet
Set data = GetSubscriptions(source)
Set target = source.Parent.Worksheets.Add
'Copy headers
target.Range(target.Cells(1, 1), target.Cells(1, 4)).Value = _
source.Range(source.Cells(1, 1), source.Cells(1, 4)).Value
Dim row As Long
row = 2
Dim key As Variant, item As Variant
For Each key In data.Keys
For Each item In data(key)
target.Cells(row, 1) = Split(key, "|")(0)
target.Cells(row, 2) = Split(key, "|")(1)
target.Cells(row, 3) = Split(item, "|")(0)
target.Cells(row, 4) = Split(item, "|")(1)
row = row + 1
Next item
Next key
End Sub
Private Function GetSubscriptions(source As Worksheet) As Dictionary
Dim subscrips As Dictionary
Set subscrips = New Dictionary
Dim row As Long
Dim cust As String, subs As String, starting As String, ending As String
'Gather all the data as pairs of customer|subscription, starting|ending
For row = 2 To source.UsedRange.Rows.Count
Dim dates() As String
cust = source.Cells(row, 1).Value
subs = source.Cells(row, 2).Value
'Valid customer/subscription?
If cust <> vbNullString And subs <> vbNullString Then
starting = source.Cells(row, 3).Value
ending = source.Cells(row, 4).Value
'Has an ending and starting date?
If starting <> vbNullString And ending <> vbNullString Then
Dim key As String
key = cust & "|" & subs
'New combo?
If Not subscrips.Exists(key) Then
subscrips.Add key, New Collection
subscrips(key).Add starting & "|" & ending
Else
subscrips(key).Add starting & "|" & ending
Set subscrips(key) = MergeDates(subscrips(key))
End If
End If
End If
Next row
Set GetSubscriptions = subscrips
End Function
Private Function MergeDates(dates As Collection) As Collection
Dim candidate As Long, index As Long
Dim values() As String, test() As String
Dim merge As Boolean
For index = 1 To dates.Count
values = Split(dates(index), "|")
'Check to see if it can be merged with any other row.
For candidate = index + 1 To dates.Count
test = Split(dates(candidate), "|")
If CDate(test(0)) >= CDate(values(0)) And _
CDate(test(0)) <= CDate(values(1)) Or _
CDate(test(1)) >= CDate(values(0)) And _
CDate(test(1)) <= CDate(values(1)) Then
dates.Remove candidate
merge = True
Exit For
End If
Next candidate
If merge Then Exit For
Next index
If merge Then
'Pull both rows out of the collection.
dates.Remove index
values(0) = IIf(CDate(test(0)) < CDate(values(0)), _
CDate(test(0)), CDate(values(0)))
values(1) = IIf(CDate(test(1)) > CDate(values(1)), _
CDate(test(1)), CDate(values(1)))
'Put the merged date range back in.
dates.Add values(0) & "|" & values(1)
'Recurse.
Set MergeDates = MergeDates(dates)
End If
Set MergeDates = dates
End Function
它确实需要通过数据验证,错误捕获等来充实,并且它当前只是将结果数据放在新工作表上。所有工作都在GetSubscriptions
函数中完成,因此您只需从中获取返回的Dictionary,并对其中的数据执行任何操作。