I have a data frame of date-times:
tdata_df <- data.frame(timestamp=seq(c(ISOdate(2018,4,20)), by = (60*229), length.out = 6))
tdata_df
            timestamp
1 2018-04-20 21:00:00
2 2018-04-21 00:49:00
3 2018-04-21 04:38:00
4 2018-04-21 08:27:00
5 2018-04-21 12:16:00
6 2018-04-21 16:05:00
For each timestamp, I then want to look up a value from this table of time-of-day ranges:
time_range_df <- data.frame(start=c("08:30","11:35","15:10","05:00"),
                            end=c("11:29","15:09","02:29","08:29"),value=c(1,2,3,4))
time_range_df
  start   end value
1 08:30 11:29     1
2 11:35 15:09     2
3 15:10 02:29     3
4 05:00 08:29     4
so that the result looks like this:
            timestamp value
1 2018-04-20 21:00:00     3
2 2018-04-21 00:49:00     3
3 2018-04-21 04:38:00    NA
4 2018-04-21 08:27:00     4
5 2018-04-21 12:16:00     2
6 2018-04-21 16:05:00     3
Any help is much appreciated.
Answer 0 (score: 1)
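The sqldf package offers more flexibility for this kind of situation. The approach is:
1. Convert the times in time_range_df to offsets from midnight, in seconds.
2. Add a column to tdata_df holding the time elapsed since midnight for each timestamp.
3. Join the two data frames wherever the time since midnight falls inside a range, treating a range whose start is later than its end (such as 15:10 to 02:29) as wrapping past midnight.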
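The wrap-around rule in the last step is the part that is easiest to get wrong, so here is a minimal sketch of it in isolation. The helper name in_range is hypothetical and not part of the original answer; it only assumes that all times are already expressed as seconds since midnight.

```r
# Hypothetical helper, not part of the original answer.
# TRUE when t (seconds since midnight) lies in [start, end]; a range whose
# start is later than its end is treated as wrapping past midnight.
in_range <- function(t, start, end) {
  (start <= end & t >= start & t <= end) |
    (start > end & (t >= start | t <= end))
}

start <- 15 * 3600 + 10 * 60   # "15:10" as seconds since midnight
end   <-  2 * 3600 + 29 * 60   # "02:29" as seconds since midnight

# 21:00, 00:49 and 04:38 expressed the same way
t <- c(21 * 3600, 49 * 60, 4 * 3600 + 38 * 60)
in_range(t, start, end)
# [1]  TRUE  TRUE FALSE
```

The first two times fall inside the 15:10 to 02:29 range (one before midnight, one after), while 04:38 does not.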
library(lubridate)
# Convert the range bounds from "HH:MM" to seconds since midnight
time_range_df$start <- as.numeric(seconds(hm(time_range_df$start)))
time_range_df$end <- as.numeric(seconds(hm(time_range_df$end)))
# Seconds since midnight for each timestamp
tdata_df$timeSinceMidNigh <- as.numeric(seconds(hms(format(ymd_hms(tdata_df$timestamp),
                                                          format = "%H:%M:%S"))))
library(sqldf)
# The LEFT JOIN keeps timestamps that match no range (their value is NA).
# A range with start > end wraps past midnight, so it is matched in two
# parts: from start to 86400 and from 0 to end.
sqlquery <- "SELECT D1.timestamp, Q.value FROM tdata_df D1
  LEFT JOIN (SELECT * FROM tdata_df D, time_range_df R
    WHERE (R.start < R.end AND D.timeSinceMidNigh between R.start AND R.end) OR
          (R.start > R.end AND D.timeSinceMidNigh between R.start AND 86400) OR
          (R.start > R.end AND D.timeSinceMidNigh between 0 and R.end)) Q
  ON D1.timestamp = Q.timestamp"
sqldf(sqlquery)
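# Output from the answerer's session. The clock times differ from the
# question's printout, presumably because of a different session time zone.
#             timestamp value
# 1 2018-04-20 13:00:00     2
# 2 2018-04-20 16:49:00     3
# 3 2018-04-20 20:38:00     3
# 4 2018-04-21 00:27:00     3
# 5 2018-04-21 04:16:00    NA
# 6 2018-04-21 08:05:00     4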
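For comparison, the same lookup can be sketched in base R without sqldf or lubridate. This is an illustrative alternative, not the method used in the answer. It makes two assumptions: the timestamps are typed in exactly as they appear in the question's printout, with the time zone pinned to UTC so the result does not depend on the session, and to_seconds is a hypothetical helper introduced only for this example.

```r
# Timestamps typed in as printed in the question; tz is pinned to UTC here
# (an assumption) so the result does not depend on the session time zone.
tdata_df <- data.frame(timestamp = as.POSIXct(
  c("2018-04-20 21:00:00", "2018-04-21 00:49:00", "2018-04-21 04:38:00",
    "2018-04-21 08:27:00", "2018-04-21 12:16:00", "2018-04-21 16:05:00"),
  tz = "UTC"))
time_range_df <- data.frame(start = c("08:30", "11:35", "15:10", "05:00"),
                            end   = c("11:29", "15:09", "02:29", "08:29"),
                            value = c(1, 2, 3, 4),
                            stringsAsFactors = FALSE)

# Hypothetical helper: "HH:MM" or "HH:MM:SS" -> seconds since midnight.
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)[seq_along(p)]),
         numeric(1))
}

t_sec <- to_seconds(format(tdata_df$timestamp, "%H:%M:%S"))
s_sec <- to_seconds(time_range_df$start)
e_sec <- to_seconds(time_range_df$end)

# For each timestamp, take the value of the first range that contains it
# (ranges with start > end wrap past midnight); NA when no range matches.
tdata_df$value <- vapply(t_sec, function(t) {
  hit <- (s_sec <= e_sec & t >= s_sec & t <= e_sec) |
         (s_sec >  e_sec & (t >= s_sec | t <= e_sec))
  if (any(hit)) time_range_df$value[which(hit)[1]] else NA_real_
}, numeric(1))

tdata_df$value
# [1]  3  3 NA  4  2  3
```

With the clock times fixed this way, the values match the result requested in the question. If ranges could overlap, this sketch keeps only the first match, whereas the SQL join would return one row per matching range.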
Data:
tdata_df <- data.frame(timestamp=seq(c(ISOdate(2018,4,20)), by = (60*229), length.out = 6))
time_range_df <- data.frame(start=c("08:30","11:35","15:10","05:00"),
end=c("11:29","15:09","02:29","08:29"),value=c(1,2,3,4))