Keep rows that meet one condition, and keep the row above if it meets another condition

Time: 2018-09-19 03:57:17

Tags: r subset

My dataset is similar to the one below, but much more complex:

df<-data.frame(ID = c(1,1,2,2,3,3,3), 
               week = c(20,21,10,15,20,21,22), 
               var1 = c(0,1,0,1,0,0,1))

  ID week var1
1  1   20    0
2  1   21    1
3  2   10    0
4  2   15    1
5  3   20    0
6  3   21    0
7  3   22    1

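I want to create a new data frame that keeps every row where var1 = 1 and also keeps the row directly before it, but only if that row has the same ID and its week is exactly one less than the week of the kept row. The new data frame would look like this: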

  ID week var1
1  1   20    0
2  1   21    1
3  2   15    1
4  3   21    0
5  3   22    1

I have tried

df1<-df[which(df$var1 == 1) - 1, ]

However, this gives me the previous row whether or not it meets my conditions.

I have also tried dplyr's lag

df2<-filter(df, var1==1 & lag(week)==week-1)

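However, this only gives me the rows that satisfy both conditions at once. Every piece of code I have found produces one of these two results, never the combination I need.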

2 Answers:

Answer 0: (score: 0)

You can handle each condition in turn:

For your data frame:

df<-data.frame(ID = c(1,1,2,2,3,3,3), 
               week = c(20,21,10,15,20,21,22), 
               var1 = c(0,1,0,1,0,0,1))

you want to select the following

#   ID week var1
# 1  1   20    0 # <- condition 2 + condition 3
# 2  1   21    1 # <- condition 1
# 3  2   10    0 # <- condition 2
# 4  2   15    1 # <- condition 1
# 5  3   20    0 #
# 6  3   21    0 # <- condition 2 + condition 3
# 7  3   22    1 # <- condition 1

and keep only the rows that satisfy condition 1, or conditions 2 + 3 together:

## Condition 1: Selecting the rows with var1 = 1
rows_var1 <- which(df$var1 == 1)
rows_var1
# [1] 2 4 7

## Condition 2: Selecting all the previous rows with the same ID
same_ID <- (rows_var1 - 1)[(df$ID[rows_var1] == df$ID[rows_var1 - 1])]
same_ID
# [1] 1 3 6

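## Condition 3: Of those, keep the rows whose week is one less than the week of the var1 row below
## (index with same_ID + 1 rather than rows_var1 so both sides stay the same length
##  even when condition 2 has dropped some candidates)
same_ID_week <- same_ID[df$week[same_ID] == (df$week[same_ID + 1] - 1)]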
same_ID_week
# [1] 1 6

## Getting the table subset
df1 <- df[sort(c(rows_var1, same_ID_week)),]

#   ID week var1
# 1  1   20    0
# 2  1   21    1
# 4  2   15    1
# 6  3   21    0
# 7  3   22    1
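
The steps above can be folded into one helper that also guards against a var1 = 1 match in the very first row, which has no row above it. What follows is a minimal base R sketch rather than part of the original answer: the name keep_with_prev is hypothetical, and it assumes the rows are already sorted by ID and then week and that ID, week and var1 contain no missing values.

```r
df <- data.frame(ID = c(1,1,2,2,3,3,3),
                 week = c(20,21,10,15,20,21,22),
                 var1 = c(0,1,0,1,0,0,1))

# Hypothetical helper: keep rows with var1 == 1, plus the row directly above
# each of them when it has the same ID and a week exactly one less.
keep_with_prev <- function(d) {
  hit <- which(d$var1 == 1)   # condition 1
  cand <- hit[hit > 1]        # a hit in row 1 has no row above it
  prev <- cand - 1            # the row directly above each remaining hit
  # conditions 2 and 3, compared pairwise so the vectors always line up
  keep_prev <- prev[d$ID[prev] == d$ID[cand] & d$week[prev] == d$week[cand] - 1]
  d[sort(unique(c(hit, keep_prev))), ]
}

result <- keep_with_prev(df)
rownames(result)
# [1] "1" "2" "4" "6" "7"
```

Because each previous row is compared only with the hit directly below it, no length mismatch can occur when some candidates are dropped. Call rownames(result) <- NULL afterwards if you want the rows renumbered from 1.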

Answer 1: (score: 0)

Using SQL, we can:

library(sqldf)

sqldf("select b.* from df a join df b on a.ID = b.ID and b.week = a.week - 1
       where a.var1 = 1
       union
       select * from df 
       where var1 = 1
       order by ID, week")

giving

  ID week var1
1  1   20    0
2  1   21    1
3  2   15    1
4  3   21    0
5  3   22    1
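
For readers who would rather not depend on sqldf, the same self-join can be approximated in base R with merge. This is a sketch under stated assumptions, not part of the original answer: the intermediate names hits, wanted, prev and out are made up for illustration, and the key columns are assumed to hold exact whole-number values so that the week - 1 match is reliable.

```r
df <- data.frame(ID = c(1,1,2,2,3,3,3),
                 week = c(20,21,10,15,20,21,22),
                 var1 = c(0,1,0,1,0,0,1))

hits <- df[df$var1 == 1, ]                       # second SELECT: rows with var1 = 1
# For each hit, the (ID, week) key of the row it should pull in.
wanted <- data.frame(ID = hits$ID, week = hits$week - 1)
prev <- merge(df, wanted, by = c("ID", "week"))  # inner join, like the SQL join
out <- unique(rbind(prev, hits))                 # UNION drops duplicate rows
out <- out[order(out$ID, out$week), ]            # ORDER BY ID, week
rownames(out) <- NULL
out
```

Like the SQL version, this matches on the week value rather than on row position, so it does not require the matching row to sit directly above the var1 = 1 row.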