Subset data based on a date range and a variable match

Time: 2017-04-15 17:50:49

Tags: r subset

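I have two different data sets, and I want to subset the first based on a combination of variables from the second. Specifically, I want to combine a date-range condition with a direct variable match.

See the data frames created below: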

options(stringsAsFactors = FALSE)


Loc<-rep("A",10)
InEvent<-rep("IN",10)
InDate<-c("2016-05-10","2016-05-20","2016-05-25","2016-06-10","2016-06-20","2016-07-05","2016-07-17","2016-07-27","2016-08-10","2016-08-20")
InSN<-c("H1","H1","H1","H1","H1","H2","H2","H2","H2","H2")
OutEvent<-rep("OUT",10)
OutDate<-c("2016-05-15","2016-05-23","2016-06-02","2016-06-14","2016-06-26","2016-07-09","2016-07-26","2016-08-09","2016-08-19","2016-08-26")
OutSN<-c("H1","H1","H1","H1","H1","H2","H2","H2","H2","H2")

Cali<-data.frame(Loc,InEvent,InDate,InSN,OutEvent,OutDate,OutSN)

Cali$InDate<-as.POSIXct(strptime(Cali$InDate,format="%Y-%m-%d", tz="UTC"))

Cali$OutDate<-as.POSIXct(strptime(Cali$OutDate,format="%Y-%m-%d", tz="UTC"))
Cali



Sen<-rep("CL",20)
Date<-c("2016-04-10","2016-05-11","2016-05-12","2016-05-13","2016-05-17","2016-05-26","2016-06-17","2016-06-27","2016-07-08","2016-07-20","2016-07-27","2016-08-01","2016-08-05","2016-08-07","2016-08-12","2016-08-15","2016-08-19","2016-08-20","2016-08-23","2016-09-20")
SN<-c("H1","H1","H2","H5","H1","H1","H7","H2","H2","H2","H1","H2","H1","H5","H2","H5","H3","H1","5","H2")


Data<-data.frame(Sen,Date,SN)


Data$Date<-as.POSIXct(strptime(Data$Date,format="%Y-%m-%d", tz="UTC"))
Data

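In the final result, I want only the rows of the "Data" data frame that fall within a date range in "Cali" and that also match the H values in "InSN" and "OutSN".

For example, the first row of Cali has the range 2016-05-10 to 2016-05-15 and an SN value of H1. So I want only the rows of "Data" that fall within this date range and have "H1" in the "SN" column.

The resulting data frame should be a subset of "Data" that includes only the rows meeting the matching criteria (rows 2, 6, 9, 10, 12, 15).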

1 Answer:

Answer 0: (score: 1)

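library(dplyr)
library(magrittr)
# Pair every Cali interval with the Data rows that share its serial number,
# then keep only the pairs whose Date lies inside the interval (inclusive).
# Note: this overwrites Data, and the result carries the Cali columns too.
Data=left_join(Cali, Data, by = c("InSN"="SN")) %>%
  filter(Date>=InDate, Date<=OutDate)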
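
If the result should keep only the original columns of Data rather than the joined Cali columns, the same condition can also be written in base R. The following is a minimal sketch, not part of the original answer: it rebuilds a trimmed copy of the question's data so that it runs on its own, and the names in_window and Result are made up for illustration. It assumes both interval endpoints are inclusive and requires SN to equal both InSN and OutSN.

```r
# Trimmed copy of the question's data (only the columns the match needs).
Cali <- data.frame(
  InSN    = rep(c("H1", "H2"), each = 5),
  OutSN   = rep(c("H1", "H2"), each = 5),
  InDate  = as.POSIXct(c("2016-05-10", "2016-05-20", "2016-05-25", "2016-06-10",
                         "2016-06-20", "2016-07-05", "2016-07-17", "2016-07-27",
                         "2016-08-10", "2016-08-20"), tz = "UTC"),
  OutDate = as.POSIXct(c("2016-05-15", "2016-05-23", "2016-06-02", "2016-06-14",
                         "2016-06-26", "2016-07-09", "2016-07-26", "2016-08-09",
                         "2016-08-19", "2016-08-26"), tz = "UTC"),
  stringsAsFactors = FALSE
)

Data <- data.frame(
  Sen  = "CL",
  Date = as.POSIXct(c("2016-04-10", "2016-05-11", "2016-05-12", "2016-05-13",
                      "2016-05-17", "2016-05-26", "2016-06-17", "2016-06-27",
                      "2016-07-08", "2016-07-20", "2016-07-27", "2016-08-01",
                      "2016-08-05", "2016-08-07", "2016-08-12", "2016-08-15",
                      "2016-08-19", "2016-08-20", "2016-08-23", "2016-09-20"),
                    tz = "UTC"),
  SN   = c("H1", "H1", "H2", "H5", "H1", "H1", "H7", "H2", "H2", "H2",
           "H1", "H2", "H1", "H5", "H2", "H5", "H3", "H1", "5", "H2"),
  stringsAsFactors = FALSE
)

# For each Data row, check whether at least one Cali interval has the same
# serial number (in both InSN and OutSN) and contains the row's date.
in_window <- vapply(seq_len(nrow(Data)), function(i) {
  any(Cali$InSN == Data$SN[i] &
      Cali$OutSN == Data$SN[i] &
      Cali$InDate <= Data$Date[i] &
      Data$Date[i] <= Cali$OutDate)
}, logical(1))

# Logical row indexing keeps Data's own columns and original row names.
Result <- Data[in_window, ]
Result
```

With the question's data this keeps rows 2, 6, 9, 10, 12 and 15 of Data, the rows listed in the question. The loop compares every Data row against every Cali row, which is fine at this size; for large tables a join-based approach like the one above is likely to scale better.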