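I have two data sets, and I want to subset one of them based on a combination of variables in the other. Specifically, I want to combine a date-range match with a direct match on a variable.
See the data frames created below: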
options(stringsAsFactors = FALSE)
Loc<-rep("A",10)
InEvent<-rep("IN",10)
InDate<-c("2016-05-10","2016-05-20","2016-05-25","2016-06-10","2016-06-20","2016-07-05","2016-07-17","2016-07-27","2016-08-10","2016-08-20")
InSN<-c("H1","H1","H1","H1","H1","H2","H2","H2","H2","H2")
OutEvent<-rep("OUT",10)
OutDate<-c("2016-05-15","2016-05-23","2016-06-02","2016-06-14","2016-06-26","2016-07-09","2016-07-26","2016-08-09","2016-08-19","2016-08-26")
OutSN<-c("H1","H1","H1","H1","H1","H2","H2","H2","H2","H2")
Cali<-data.frame(Loc,InEvent,InDate,InSN,OutEvent,OutDate,OutSN)
Cali$InDate<-as.POSIXct(strptime(Cali$InDate,format="%Y-%m-%d", tz="UTC"))
Cali$OutDate<-as.POSIXct(strptime(Cali$OutDate,format="%Y-%m-%d", tz="UTC"))
Cali
Sen<-rep("CL",20)
Date<-c("2016-04-10","2016-05-11","2016-05-12","2016-05-13","2016-05-17","2016-05-26","2016-06-17","2016-06-27","2016-07-08","2016-07-20","2016-07-27","2016-08-01","2016-08-05","2016-08-07","2016-08-12","2016-08-15","2016-08-19","2016-08-20","2016-08-23","2016-09-20")
SN<-c("H1","H1","H2","H5","H1","H1","H7","H2","H2","H2","H1","H2","H1","H5","H2","H5","H3","H1","5","H2")
Data<-data.frame(Sen,Date,SN)
Data$Date<-as.POSIXct(strptime(Data$Date,format="%Y-%m-%d", tz="UTC"))
Data
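In the final result, I want only the rows of the Data data frame whose Date falls within one of the date ranges in Cali and whose SN also matches the H value in InSN and OutSN for that range.
For example, the first row of Cali covers the range 2016-05-10 to 2016-05-15 and has the SN value H1. So I only want the rows of Data that fall within this date range and have "H1" in the SN column.
The result should be a subset of Data that includes only the rows meeting the matching criteria (rows 2, 6, 9, 10, 12, 15).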
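The single-interval example above can be sketched in base R as follows. This is only an illustration of the stated criterion for the first Cali row, not a full solution; the names in_date, out_date, and first_match are hypothetical, and Data is rebuilt here so the snippet runs on its own.

```r
# Rebuild the Data frame from the question (dates as UTC POSIXct).
Data <- data.frame(
  Sen  = rep("CL", 20),
  Date = as.POSIXct(c("2016-04-10","2016-05-11","2016-05-12","2016-05-13",
                      "2016-05-17","2016-05-26","2016-06-17","2016-06-27",
                      "2016-07-08","2016-07-20","2016-07-27","2016-08-01",
                      "2016-08-05","2016-08-07","2016-08-12","2016-08-15",
                      "2016-08-19","2016-08-20","2016-08-23","2016-09-20"),
                    format = "%Y-%m-%d", tz = "UTC"),
  SN   = c("H1","H1","H2","H5","H1","H1","H7","H2","H2","H2",
           "H1","H2","H1","H5","H2","H5","H3","H1","5","H2"),
  stringsAsFactors = FALSE
)

# First Cali interval: 2016-05-10 to 2016-05-15 (inclusive), serial number H1.
# in_date and out_date are illustrative names, not part of the original code.
in_date  <- as.POSIXct("2016-05-10", format = "%Y-%m-%d", tz = "UTC")
out_date <- as.POSIXct("2016-05-15", format = "%Y-%m-%d", tz = "UTC")

# Keep rows inside the interval whose SN equals "H1".
first_match <- Data[Data$Date >= in_date & Data$Date <= out_date & Data$SN == "H1", ]
first_match  # only row 2 (2016-05-11, H1) qualifies
```

Rows 3 and 4 also fall inside this date range, but their SN values (H2 and H5) do not match, so they are excluded.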
Answer 0 (score: 1)
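library(dplyr)  # dplyr re-exports %>%, so loading magrittr separately is unnecessary
# Pair each Cali interval with the Data rows that share its serial number,
# then keep only the rows whose Date lies within [InDate, OutDate].
Result <- left_join(Cali, Data, by = c("InSN" = "SN")) %>%
  filter(Date >= InDate, Date <= OutDate) %>%
  select(Sen, Date, SN = InSN)  # keep only the original Data columns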
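For readers without dplyr, the same join-then-filter idea can be sketched in base R with merge. This is an alternative illustration rather than the answer's own method. It assumes that matching on InSN alone is sufficient, since InSN and OutSN are identical in the sample data, and it rebuilds only the Cali columns needed for matching. The names row_id, pairs, hits, and matched_rows are hypothetical.

```r
# Rebuild the matching columns of Cali and the full Data frame (UTC POSIXct dates).
to_utc <- function(x) as.POSIXct(x, format = "%Y-%m-%d", tz = "UTC")

Cali <- data.frame(
  InDate  = to_utc(c("2016-05-10","2016-05-20","2016-05-25","2016-06-10","2016-06-20",
                     "2016-07-05","2016-07-17","2016-07-27","2016-08-10","2016-08-20")),
  OutDate = to_utc(c("2016-05-15","2016-05-23","2016-06-02","2016-06-14","2016-06-26",
                     "2016-07-09","2016-07-26","2016-08-09","2016-08-19","2016-08-26")),
  InSN    = rep(c("H1", "H2"), each = 5),
  stringsAsFactors = FALSE
)

Data <- data.frame(
  Sen  = rep("CL", 20),
  Date = to_utc(c("2016-04-10","2016-05-11","2016-05-12","2016-05-13","2016-05-17",
                  "2016-05-26","2016-06-17","2016-06-27","2016-07-08","2016-07-20",
                  "2016-07-27","2016-08-01","2016-08-05","2016-08-07","2016-08-12",
                  "2016-08-15","2016-08-19","2016-08-20","2016-08-23","2016-09-20")),
  SN   = c("H1","H1","H2","H5","H1","H1","H7","H2","H2","H2",
           "H1","H2","H1","H5","H2","H5","H3","H1","5","H2"),
  stringsAsFactors = FALSE
)

# Remember each Data row's original position before merging reorders rows.
Data$row_id <- seq_len(nrow(Data))

# Pair every Data row with every Cali interval that has the same serial number.
pairs <- merge(Data, Cali, by.x = "SN", by.y = "InSN")

# Keep pairs whose Date lies inside the interval (inclusive on both ends).
hits <- pairs[pairs$Date >= pairs$InDate & pairs$Date <= pairs$OutDate, ]

# Recover the matching Data rows in their original order.
matched_rows <- sort(unique(hits$row_id))
Result <- Data[matched_rows, c("Sen", "Date", "SN")]
matched_rows  # 2 6 9 10 12 15
```

Carrying row_id through the merge makes it easy to confirm that the result agrees with the rows listed in the question, and unique guards against a Data row being returned twice if two Cali intervals for the same serial number ever overlapped.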