我有两个数据表,如下所示
DT1
库(data.table) 库(lubridate)
DT1<-data.frame(
id=c(7,7,7,3,3,3),
start_time=c("2017-11-01 08:37:35","2017-11-01 09:07:44","2017-11-01 09:46:16","2017-11-01 10:32:29","2017-11-01 10:59:25","2017-11-01 13:24:12"),
cube=c(628,625,469,711,376,628)
)
DT1=data.table(DT1)
id start_time cube
1: 7 2017-11-01 08:37:35 628
2: 7 2017-11-01 09:07:44 625
3: 7 2017-11-01 09:46:16 469
4: 3 2017-11-01 10:32:29 711
5: 3 2017-11-01 10:59:25 376
6: 3 2017-11-01 13:24:12 628
DT2
DT2<-data.frame(
id=c(7,7,7,3,3,3),
res_time=c("2017-11-01 08:35:30","2017-11-01 09:07:48","2017-11-01 09:46:32","2017-11-01 10:31:29","2017-11-01 10:57:25","2017-11-01 13:22:10"),
res_cube=c(309,625,469,712,375,630)
)
DT2=data.table(DT2)
id res_time res_cube
1: 7 2017-11-01 08:35:30 309
2: 7 2017-11-01 09:07:48 625
3: 7 2017-11-01 09:46:32 469
4: 3 2017-11-01 10:31:29 712
5: 3 2017-11-01 10:57:25 375
6: 3 2017-11-01 13:22:10 630
从这两个我需要加入最接近的res_time
DT1行并关联所有DT1行,并附加最接近的res_time
和res_cube
以及res_time
对于每个id组。所以我试过
DT1 = DT1[,start_time := as.character(start_time)]
DT2 = DT2[,res_time := as.character(res_time)]
DT1 = DT1[,start_time := parse_date_time2(start_time,orders="YmdHMS",tz="NA")]
DT2 = DT2[,res_time := parse_date_time2(res_time,orders="YmdHMS",tz="NA")]
setkeyv(DT1, c("id","start_time"))
setkeyv(DT2, c("id","res_time"))
ans = DT1[DT2, roll=Inf]
但这给了这样的东西
id start_time cube res_cube
1: 7 2017-11-01 08:35:30 NA 309
2: 7 2017-11-01 09:07:48 NA 625
3: 7 2017-11-01 09:46:32 NA 469
4: 3 2017-11-01 10:31:29 NA 712
5: 3 2017-11-01 10:57:25 NA 375
6: 3 2017-11-01 13:22:10 NA 630
我没有得到res_time
,我认为我在滚动连接中做错了。
我还注意到我在连接结果的start_time列中获得了res_time。
感谢任何帮助。
答案 0 :(得分:1)
我设法通过前滚获得了结果 而不是滚动落后
setDT(DT1)
setDT(DT2)
DT1 = DT1[,start_time := parse_date_time2(start_time,orders="YmdHMS",tz="NA")]
DT2 = DT2[,res_time := parse_date_time2(res_time,orders="YmdHMS",tz="NA")]
DT1 [,time:=start_time]
DT2[,time:=res_time]
setkey(DT1 ,id,time)
setkey(DT2,id,time)
ans = DT2[DT1, roll=T]# perform a Rolling Forward where Each start_time in DT1 is matched to the closest res_time in DT2.
<强> ANS 强>
id res_time cube time start_time i.cube
1: 3 2017-11-01 10:31:29 712 2017-11-01 10:32:29 2017-11-01 10:32:29 711
2: 3 2017-11-01 10:57:25 375 2017-11-01 10:59:25 2017-11-01 10:59:25 376
3: 3 2017-11-01 13:22:10 630 2017-11-01 13:24:12 2017-11-01 13:24:12 628
4: 7 2017-11-01 08:35:30 309 2017-11-01 08:37:35 2017-11-01 08:37:35 628
5: 7 2017-11-01 08:35:30 309 2017-11-01 09:07:44 2017-11-01 09:07:44 625
6: 7 2017-11-01 09:07:48 625 2017-11-01 09:46:16 2017-11-01 09:46:16 469
希望有所帮助