Select values from a time-range data frame in R

Time: 2018-04-20 05:02:33

Tags: r

I have a data frame of datetimes:

tdata_df <- data.frame(timestamp=seq(c(ISOdate(2018,4,20)), by = (60*229), length.out = 6))

tdata_df

            timestamp
1 2018-04-20 21:00:00
2 2018-04-21 00:49:00
3 2018-04-21 04:38:00
4 2018-04-21 08:27:00
5 2018-04-21 12:16:00
6 2018-04-21 16:05:00

For each timestamp, I want to look up the matching value in this time-range table:

time_range_df <- data.frame(start=c("08:30","11:35","15:10","05:00"),                     
               end=c("11:29","15:09","02:29","08:29"),value=c(1,2,3,4))

time_range_df

   start   end value
 1 08:30 11:29     1
 2 11:35 15:09     2
 3 15:10 02:29     3
 4 05:00 08:29     4
so that the result looks like this:

            timestamp value
1 2018-04-20 21:00:00     3
2 2018-04-21 00:49:00     3
3 2018-04-21 04:38:00    NA
4 2018-04-21 08:27:00     4
5 2018-04-21 12:16:00     2
6 2018-04-21 16:05:00     3

Any help is greatly appreciated.

1 answer:

Answer 0 (score: 1)

The sqldf package offers more flexibility for this kind of situation. The approach is:

  

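1. Convert the times in time_range_df to offsets from midnight (in seconds).
2. Add a column to tdata_df holding the time elapsed since midnight.
3. Join the two data frames where the time since midnight falls within a range.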
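
To make the first step concrete, here is a minimal base R sketch of the conversion, assuming the times are "HH:MM" strings. `to_seconds` is a hypothetical helper name, not something from lubridate or sqldf. It also shows why the third range needs special handling: its start is later than its end, so it crosses midnight.

```r
# to_seconds is a hypothetical helper: "HH:MM" -> seconds since midnight.
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 3600 + as.numeric(p[2]) * 60,
         numeric(1))
}

start <- to_seconds(c("08:30", "11:35", "15:10", "05:00"))
end   <- to_seconds(c("11:29", "15:09", "02:29", "08:29"))

# A range whose start is later than its end crosses midnight.
crosses_midnight <- start > end
crosses_midnight
# [1] FALSE FALSE  TRUE FALSE
```
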

library(lubridate)
time_range_df$start <- as.numeric(seconds(hm(time_range_df$start)))
time_range_df$end <- as.numeric(seconds(hm(time_range_df$end)))

tdata_df$timeSinceMidNigh <- as.numeric(seconds(hms(format(ymd_hms(tdata_df$timestamp),
              format = "%H:%M:%S"))))


library(sqldf)


sqlquery <- "SELECT D1.timestamp, Q.value FROM tdata_df D1
             LEFT JOIN (SELECT * FROM tdata_df D, time_range_df R
             WHERE  (R.start < R.end AND D.timeSinceMidNigh between R.start AND R.end) OR
             (R.start > R.end AND D.timeSinceMidNigh between R.start AND 86400) OR
             (R.start > R.end AND D.timeSinceMidNigh between 0 and R.end)) Q
             ON D1.timestamp = Q.timestamp"




sqldf(sqlquery)
# timestamp             value
# 1 2018-04-20 13:00:00     2
# 2 2018-04-20 16:49:00     3
# 3 2018-04-20 20:38:00     3
# 4 2018-04-21 00:27:00     3
# 5 2018-04-21 04:16:00    NA
# 6 2018-04-21 08:05:00     4
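
If you would rather avoid the SQL join, the same lookup can be sketched in base R. This is only an illustration under a few assumptions: the ranges are already in seconds since midnight (as after the conversion above), `lookup_value` is a hypothetical helper, and the timestamps are built and read in UTC so the result does not depend on the session time zone. The clock times printed in the question and in this answer differ, presumably because each session printed them in its own local time zone.

```r
# Ranges in seconds since midnight.
ranges <- data.frame(
  start = c(30600, 41700, 54600, 18000),  # 08:30, 11:35, 15:10, 05:00
  end   = c(41340, 54540,  8940, 30540),  # 11:29, 15:09, 02:29, 08:29
  value = c(1, 2, 3, 4)
)

# Timestamps with an explicit UTC time zone (assumption, for reproducibility).
ts <- seq(ISOdatetime(2018, 4, 20, 12, 0, 0, tz = "UTC"),
          by = 60 * 229, length.out = 6)

# Seconds since midnight, read from the UTC clock time.
lt <- as.POSIXlt(ts, tz = "UTC")
secs <- lt$hour * 3600 + lt$min * 60 + lt$sec

# lookup_value (hypothetical): value of the first matching range, else NA.
lookup_value <- function(s, ranges) {
  # Ordinary range: start <= s <= end.
  same_day <- ranges$start <= ranges$end &
    s >= ranges$start & s <= ranges$end
  # Range crossing midnight: s is after start OR before end.
  wraps <- ranges$start > ranges$end &
    (s >= ranges$start | s <= ranges$end)
  hit <- which(same_day | wraps)
  if (length(hit) == 0) NA_real_ else ranges$value[hit[1]]
}

result <- data.frame(
  timestamp = ts,
  value = vapply(secs, lookup_value, numeric(1), ranges = ranges)
)
result$value
# [1]  2  3  3  3 NA  4
```

The two logical vectors mirror the three OR branches of the SQL WHERE clause: one branch for ordinary ranges, and the two midnight-crossing branches folded into a single "after start or before end" test.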

Data:

tdata_df <- data.frame(timestamp=seq(c(ISOdate(2018,4,20)), by = (60*229), length.out = 6))

time_range_df <- data.frame(start=c("08:30","11:35","15:10","05:00"),                     
               end=c("11:29","15:09","02:29","08:29"),value=c(1,2,3,4))