I have the following data.table, dtgrouped2:
MonthNo Unique Total
1: 1 AAA 10
2: 1 BBB 0
3: 2 CCC 3
4: 2 DDD 0
5: 3 AAA 0
6: 3 BBB 35
7: 4 CCC 15
8: 4 AAA 0
9: 5 BBB 60
10: 5 CCC 0
11: 6 DDD 100
12: 6 AAA 0
And another table, dt2:
Unique1 StartDate EndDate Amount1 Amount2
1: AAA 0 1 7 0
3: AAA 1 2 5 0
2: AAA 2 4 3 2
I want to insert Amount1 and Amount2 from dt2 into dtgrouped2 according to the following logic, evaluated for each row of dtgrouped2 and matched on Unique:
filter(StartDate< MonthNo & EndDate>=MonthNo)
then MAX(EndDate)
then insert Amount1 as Amount1 and Amount2 as Amount2
So, as you can see, the result differs from row to row. This would be the expected output:
Date MonthNo Unique Items Amounts Amount1 Amount2
Jan 1 AAA x 10 7 0
Jan 1 BBB y 2 NA NA
Feb 2 CCC x 3 NA NA
Feb 2 DDD y 15 NA NA
March 3 AAA y 20 3 2
March 3 BBB x 35 NA NA
April 4 CCC x 15 NA NA
April 4 AAA y 50 3 2
May 5 BBB x 60 NA NA
May 5 CCC y 70 NA NA
June 6 DDD x 100 NA NA
June 6 AAA y 20 NA NA
Answer 0 (score: 5)
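I would suggest using a non-equi join combined with mult = "last", so that only the last matching row of dt2 is kept. That row is the one with the latest EndDate as long as dt2 is sorted by EndDate within each Unique1, as it is here.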
dtgrouped2[, c("Amount1", "Amount2") := # Assign the below result to new columns in dtgrouped2
dt2[dtgrouped2, # join
.(Amount1, Amount2), # get the column you need
on = .(Unique1 = Unique, # join conditions
StartDate < MonthNo,
EndDate >= MonthNo),
mult = "last"]] # get always the latest EndDate
dtgrouped2
# MonthNo Unique Total Amount1 Amount2
# 1: 1 AAA 10 7 0
# 2: 1 BBB 0 NA NA
# 3: 2 CCC 3 NA NA
# 4: 2 DDD 0 NA NA
# 5: 3 AAA 0 3 2
# 6: 3 BBB 35 NA NA
# 7: 4 CCC 15 NA NA
# 8: 4 AAA 0 3 2
# 9: 5 BBB 60 NA NA
# 10: 5 CCC 0 NA NA
# 11: 6 DDD 100 NA NA
# 12: 6 AAA 0 NA NA
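If dt2 cannot be assumed to be sorted by EndDate, one option is to make the MAX(EndDate) step explicit instead of relying on row order. The following is only a sketch under stated assumptions: the fourth dt2 row (amounts 99) is hypothetical and added purely to create an overlapping interval, and rowid, matches and best are helper names that do not appear in the question.

```r
library(data.table)

dtgrouped2 <- data.table(
  MonthNo = rep(1:6, each = 2),
  Unique  = c("AAA", "BBB", "CCC", "DDD", "AAA", "BBB",
              "CCC", "AAA", "BBB", "CCC", "DDD", "AAA"),
  Total   = c(10L, 0L, 3L, 0L, 0L, 35L, 15L, 0L, 60L, 0L, 100L, 0L)
)

# First three rows are from the question; the last row is HYPOTHETICAL and
# deliberately placed after the row that has the larger EndDate.
dt2 <- data.table(
  Unique1   = c("AAA", "AAA", "AAA", "AAA"),
  StartDate = c(0L, 1L, 2L, 2L),
  EndDate   = c(1L, 2L, 4L, 3L),
  Amount1   = c(7L, 5L, 3L, 99L),
  Amount2   = c(0L, 0L, 2L, 99L)
)

# Helper id so every match can be traced back to its dtgrouped2 row
dtgrouped2[, rowid := .I]

# Keep all matches; the x. prefix returns dt2's own values
# (without it, the join columns carry the values from dtgrouped2)
matches <- dt2[dtgrouped2,
               .(rowid = i.rowid, EndDate = x.EndDate,
                 Amount1 = x.Amount1, Amount2 = x.Amount2),
               on = .(Unique1 = Unique,
                      StartDate < MonthNo,
                      EndDate >= MonthNo)]

# Per dtgrouped2 row, keep the match with the largest EndDate
setorder(matches, rowid, EndDate)
best <- matches[, .SD[.N], by = rowid]

# Write the amounts back with an update join, then drop the helper id
dtgrouped2[best, on = "rowid",
           `:=`(Amount1 = i.Amount1, Amount2 = i.Amount2)]
dtgrouped2[, rowid := NULL]
dtgrouped2
```

For MonthNo 3 and Unique AAA, both the (2, 4) and the hypothetical (2, 3) interval match, and the row with EndDate 4 is the one kept, whatever the row order of dt2.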
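The reason you need to join as dt2[dtgrouped2] (and not the other way around) is that the lookup has to run once for every row of dtgrouped2. Each row of dtgrouped2 is matched against all candidate rows of dt2, so several rows of dt2 may match a single row of dtgrouped2, and mult = "last" then keeps just one of them. The result therefore has exactly one row per row of dtgrouped2, which is what the assignment with := requires.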
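The row alignment can be checked directly. This is a minimal sketch that rebuilds the two tables from the question (integer columns are an assumption about the original types) and stores the join result in a helper variable, lookup, which is not a name from the answer.

```r
library(data.table)

dtgrouped2 <- data.table(
  MonthNo = rep(1:6, each = 2),
  Unique  = c("AAA", "BBB", "CCC", "DDD", "AAA", "BBB",
              "CCC", "AAA", "BBB", "CCC", "DDD", "AAA"),
  Total   = c(10L, 0L, 3L, 0L, 0L, 35L, 15L, 0L, 60L, 0L, 100L, 0L)
)

dt2 <- data.table(
  Unique1   = "AAA",
  StartDate = c(0L, 1L, 2L),
  EndDate   = c(1L, 2L, 4L),
  Amount1   = c(7L, 5L, 3L),
  Amount2   = c(0L, 0L, 2L)
)

# dt2[dtgrouped2, ...]: for every row of dtgrouped2, look up matches in dt2
lookup <- dt2[dtgrouped2,
              .(Amount1, Amount2),
              on = .(Unique1 = Unique,
                     StartDate < MonthNo,
                     EndDate >= MonthNo),
              mult = "last"]

# One result row per dtgrouped2 row (unmatched rows are NA),
# so the columns line up for assignment by reference
nrow(lookup) == nrow(dtgrouped2)

dtgrouped2[, c("Amount1", "Amount2") := lookup]
dtgrouped2
```

Joining the other way around, dtgrouped2[dt2, ...], would instead be driven by the rows of dt2, so its result would not line up row by row with dtgrouped2.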