I have the following two data frames:
Dataframe1 <- data.frame(Time = seq(as.POSIXct("2017-09-06 4:30:00"), as.POSIXct("2017-09-08 15:00:15"), by = "15 min"))
Dataframe2 <- data.frame(Start_Date = as.POSIXct(c("2017-09-07 4:32:00", "2017-09-07 13:02:00", "2017-09-08 10:20:00")), End_Date = as.POSIXct(c("2017-09-07 7:20:00", "2017-09-07 17:46:00", "2017-09-08 13:41:00")))
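I want to create a new column in Dataframe1, Dataframe1$New_Column, of class "logical". If a value in Dataframe1$Time falls between a start date and an end date (i.e., between the two dates in any row of Dataframe2), Dataframe1$New_Column should be TRUE; otherwise Dataframe1$New_Column should be FALSE. The result should look like this: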
Dataframe1$New_Column <- TRUE
Dataframe1$New_Column[which(Dataframe1$Time > Dataframe2$Start_Date[1] & Dataframe1$Time < Dataframe2$End_Date[1])] <- FALSE
Dataframe1$New_Column[which(Dataframe1$Time > Dataframe2$Start_Date[2] & Dataframe1$Time < Dataframe2$End_Date[2])] <- FALSE
Dataframe1$New_Column[which(Dataframe1$Time > Dataframe2$Start_Date[3] & Dataframe1$Time < Dataframe2$End_Date[3])] <- FALSE
View(Dataframe1)
What is an efficient way to do this using base R functions?
Thanks!
Answer 0 (score: 1)
A non-equi join may be a better option:
library(data.table)
Dataframe1$New_Column <- TRUE
setDT(Dataframe1)[Dataframe2, New_Column := FALSE,
    on = .(Time > Start_Date, Time < End_Date)]
which(!Dataframe1$New_Column)
#[1] 98 99 100 101 102 103 104 105 106 107 108 132 133 134 135 136
#[17] 137 138 139 140 141 142 143 144 145 146 147 148 149 150
Using base R, we can loop over the rows of 'Dataframe2' with lapply/sapply and do the comparison:
out <- !Reduce(`|`, lapply(seq_len(nrow(Dataframe2)),
    function(i) with(Dataframe1, Time > Dataframe2$Start_Date[i] &
        Time < Dataframe2$End_Date[i])))
which(!out)
#[1] 98 99 100 101 102 103 104 105 106 107 108 132 133 134 135 136
#[17] 137 138 139 140 141 142 143 144 145 146 147 148 149 150
Dataframe1$New_Column <- out
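Since sapply was mentioned but only lapply is shown, here is a rough sketch of what an sapply variant could look like. It builds a logical matrix with one column per row of 'Dataframe2' and then collapses it with rowSums. The name `inside` and the explicit tz = "UTC" are my own additions for illustration (the time zone only makes the sketch reproducible); the data are otherwise the ones from the question.

```r
# Example data from the question; tz = "UTC" is an assumption added for reproducibility
Dataframe1 <- data.frame(
  Time = seq(as.POSIXct("2017-09-06 4:30:00", tz = "UTC"),
             as.POSIXct("2017-09-08 15:00:15", tz = "UTC"),
             by = "15 min"))
Dataframe2 <- data.frame(
  Start_Date = as.POSIXct(c("2017-09-07 4:32:00", "2017-09-07 13:02:00",
                            "2017-09-08 10:20:00"), tz = "UTC"),
  End_Date   = as.POSIXct(c("2017-09-07 7:20:00", "2017-09-07 17:46:00",
                            "2017-09-08 13:41:00"), tz = "UTC"))

# One logical column per row of Dataframe2:
# is each Time strictly inside that row's interval?
inside <- sapply(seq_len(nrow(Dataframe2)), function(i)
  Dataframe1$Time > Dataframe2$Start_Date[i] &
    Dataframe1$Time < Dataframe2$End_Date[i])

# FALSE where Time falls inside any interval, TRUE elsewhere
# (same convention as the code above)
Dataframe1$New_Column <- rowSums(inside) == 0
```

This assumes 'Dataframe2' has at least one row; with zero rows sapply would not return a matrix and rowSums would fail.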