Looping in R to look up values from a data frame

Date: 2020-02-28 02:25:23

Tags: r

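In the code below, df is a data frame containing the variables ID and Date, and df1 is a fixed lookup data frame. I want to create a new vector under the following condition: if the Date from df falls between Start_Date and End_Date in df1, and the ID from df equals ID1 in df1, the code should return the corresponding Result from df1. However, I get the warning messages shown at the end of the code below. Please help.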

> Date = as.Date(c("01/01/2012", "01/02/2015", "01/01/2018", "01/05/2019"), format = '%d/%m/%Y')
> ID = c(1,2,3,1)
> df = data.frame(ID, Date)
> 
> Start_Date = as.Date(c("01/01/2011", "01/01/2011", "01/01/2019"), format = '%d/%m/%Y')
> End_Date = as.Date(c("31/12/2018", "31/12/2019", "31/12/2019"), format = '%d/%m/%Y')
> ID1 = c(1,2,3)
> Result =c("A","B","C")
> df1 = data.frame(ID1,Start_Date,End_Date, Result)
> 
> for(i in 1:nrow(df1)) {
+ if(Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) {Result[i]}
+ }
Warning messages:
1: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used
2: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used
3: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used

1 answer:

Answer 0 (score: 1)

You can merge the two data frames on the ID columns and then keep only the rows whose Date falls within the range:

subset(merge(df, df1, by.x = 'ID', by.y = 'ID1'), 
              Date >= Start_Date & Date <= End_Date)

#  ID       Date Start_Date   End_Date Result
#1  1 2012-01-01 2011-01-01 2018-12-31      A
#3  2 2015-02-01 2011-01-01 2019-12-31      B

With dplyr, the same thing can be done as follows:

library(dplyr)
inner_join(df, df1, by = c('ID' = 'ID1')) %>%
   filter(Date >= Start_Date & Date <= End_Date)

Or with fuzzyjoin, which matches on the ID and on both date bounds in a single join:

fuzzyjoin::fuzzy_inner_join(df, df1,  
      by = c('ID' = 'ID1', 'Date' = 'Start_Date', 'Date' = 'End_Date'), 
      match_fun = list(`==`, `>=`, `<=`))
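
The approaches above return only the matching rows. The warning in the question arises because `if` expects a single TRUE/FALSE value, while `Date >= Start_Date[i] & ...` compares the whole Date column at once and so yields one value per row of df. If what you want is a vector aligned with the rows of df, a minimal base R sketch is to loop over the rows of df instead and test each row against all of df1. It assumes at most one matching range matters per row (the first match is taken) and that NA is acceptable where nothing matches; neither assumption is stated in the question.

```r
# Rebuild the example data from the question.
df <- data.frame(
  ID   = c(1, 2, 3, 1),
  Date = as.Date(c("01/01/2012", "01/02/2015", "01/01/2018", "01/05/2019"),
                 format = "%d/%m/%Y")
)
df1 <- data.frame(
  ID1        = c(1, 2, 3),
  Start_Date = as.Date(c("01/01/2011", "01/01/2011", "01/01/2019"),
                       format = "%d/%m/%Y"),
  End_Date   = as.Date(c("31/12/2018", "31/12/2019", "31/12/2019"),
                       format = "%d/%m/%Y"),
  Result     = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# For each row i of df, compare its single ID and Date against every row of df1.
# `hit` has one element per row of df1; any() collapses it to the length-one
# logical that `if` requires.
df$Result <- vapply(seq_len(nrow(df)), function(i) {
  hit <- df1$ID1 == df$ID[i] &
    df$Date[i] >= df1$Start_Date &
    df$Date[i] <= df1$End_Date
  # Assumption: take the first matching range, or NA when none matches.
  if (any(hit)) as.character(df1$Result[which(hit)[1]]) else NA_character_
}, character(1))

print(df)
```

Rows 3 and 4 of df fall outside the ranges listed for their IDs, so they get NA here, whereas the merge-based versions drop them.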