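I have two tables, shown below. In table 1, "Start" and "End" define a time period for each key.
The task is to match each timestamp in table 2 against the corresponding time period in table 1, retrieve the associated key, and assign it to the timestamp.
So far I have found between(), used together with lubridate, useful for testing whether a timestamp falls within a period. The output of between() is a logical vector. I think the code works when the value is TRUE, but it fails when a timestamp does not match any period (i.e. the value is FALSE).
Does anyone know how to solve this?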
# generate tables
library(lubridate) # ymd_hms()
library(dplyr)     # between() (assumed to come from dplyr here)
Keys = c("F11-47", "F11-49", "F11-66")
Start = c("2018-01-15 11:35:00", "2018-01-23 12:05:00", "2018-10-09 11:44:00")
End = c("2018-01-23 04:05:00", "2018-05-15 13:32:03", "2018-12-10 05:06:00")
table1 = data.frame(Keys, Start, End, stringsAsFactors = FALSE)
table1$Start = ymd_hms(table1$Start) # parse to POSIXct
table1$End = ymd_hms(table1$End) # parse to POSIXct
timestamps = c("2018-01-16 11:37:00", "2019-04-26 16:13:05", "2018-01-19 15:35:00", "2018-01-23 12:05:00", "2018-01-24 12:05:00", "2018-02-24 12:05:00",
               "2018-03-23 12:15:00", "2017-10-03 14:11:01", "2018-04-07 14:15:00", "2018-10-17 14:15:00", "2018-11-01 5:33:16", "2019-03-26 16:18:27")
table2 = data.frame(timestamps, stringsAsFactors = FALSE)
table2$Keys = "" # placeholder column, filled in below
table2$timestamps = ymd_hms(table2$timestamps) # parse to POSIXct
# what I've done so far
for (i in seq_along(table2$timestamps)) {
  timestamp = table2$timestamps[i]
  for (j in seq_along(table1$Keys)) {
    if (between(timestamp, table1$Start[j], table1$End[j])) { # test if timestamp is between a time period
      expkey = table1$Keys[j] # retrieve Key from that time period
    }
  }
  # note: expkey is never reset, so a timestamp with no match reuses the
  # previous key (or errors if no key has been found yet)
  table2$Keys[i] = expkey # assign key to timestamp
}
Answer 0 (score: 1)
Perform a left join on the indicated condition:
library(sqldf)
sqldf("select t2.timestamps, t1.Keys
from table2 t2
left join table1 t1 on t2.timestamps between t1.Start and t1.End")
giving:
            timestamps   Keys
1  2018-01-16 06:37:00 F11-47
2  2019-04-26 12:13:05   <NA>
3  2018-01-19 10:35:00 F11-47
4  2018-01-23 07:05:00 F11-49
5  2018-01-24 07:05:00 F11-49
6  2018-02-24 07:05:00 F11-49
7  2018-03-23 08:15:00 F11-49
8  2017-10-03 10:11:01   <NA>
9  2018-04-07 10:15:00 F11-49
10 2018-10-17 10:15:00 F11-66
11 2018-11-01 01:33:16 F11-66
12 2019-03-26 12:18:27   <NA>
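If you would rather not depend on sqldf, the same left-join idea can be sketched in base R. This is only an illustration, not part of the answer above: lookup_key is a hypothetical helper name, the tables are rebuilt with as.POSIXct() in UTC instead of ymd_hms(), and periods are assumed not to overlap (the first matching period wins). Timestamps that fall in no period get NA rather than a stale key. Note also that the times printed above appear shifted relative to the inputs (11:37 is shown as 06:37), which is consistent with the result being displayed in a local time zone; the sketch keeps everything in UTC.

```r
# Rebuild the two tables from the question (base R only, times in UTC).
table1 <- data.frame(
  Keys  = c("F11-47", "F11-49", "F11-66"),
  Start = as.POSIXct(c("2018-01-15 11:35:00", "2018-01-23 12:05:00",
                       "2018-10-09 11:44:00"), tz = "UTC"),
  End   = as.POSIXct(c("2018-01-23 04:05:00", "2018-05-15 13:32:03",
                       "2018-12-10 05:06:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  timestamps = as.POSIXct(c(
    "2018-01-16 11:37:00", "2019-04-26 16:13:05", "2018-01-19 15:35:00",
    "2018-01-23 12:05:00", "2018-01-24 12:05:00", "2018-02-24 12:05:00",
    "2018-03-23 12:15:00", "2017-10-03 14:11:01", "2018-04-07 14:15:00",
    "2018-10-17 14:15:00", "2018-11-01 05:33:16", "2019-03-26 16:18:27"
  ), tz = "UTC")
)

# Hypothetical helper: return the key of the first period containing ts
# (both bounds inclusive), or NA when no period matches.
lookup_key <- function(ts) {
  hit <- which(ts >= table1$Start & ts <= table1$End)
  if (length(hit) == 0) NA_character_ else table1$Keys[hit[1]]
}

# Iterate over row indices so each timestamp keeps its POSIXct class.
table2$Keys <- vapply(
  seq_along(table2$timestamps),
  function(i) lookup_key(table2$timestamps[i]),
  character(1)
)

print(table2)
```

The explicit NA branch is what the original loop was missing: every timestamp gets its own result, so an unmatched row can no longer inherit the key found for an earlier row.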