R: filtering on two columns with the "not equal" operator (dplyr / subset)

Time: 2017-11-08 23:54:04

Tags: r dplyr

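This question must have been answered before, but I cannot find it anywhere. I need to filter/subset a data frame using the values in two columns to remove rows. In the example, I want to keep every row except those that have both replicate "1" and treatment "a". However, both subset and filter remove all rows with replicate 1 and all rows with treatment a. I can work around it by using which and then indexing, but that does not fit well with the pipe operator. Do you know why filter/subset does not drop rows only when both conditions are true?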

require(dplyr)

#Create example dataframe
replicate = rep(c(1:3), times = 4)
treatment = rep(c("a","b"), each = 6)

df = data.frame(replicate, treatment)

#filtering data    
> filter(df, replicate!=1, treatment!="a")
  replicate treatment
1         2         b
2         3         b
3         2         b
4         3         b
> subset(df, (replicate!=1 & treatment!="a"))
   replicate treatment
8          2         b
9          3         b
11         2         b
12         3         b

#solution by which - indexing
index = which(df$replicate==1 & df$treatment=="a")
> df[-index,]
   replicate treatment
2          2         a
3          3         a
5          2         a
6          3         a
7          1         b
8          2         b
9          3         b
10         1         b
11         2         b
12         3         b

1 answer:

Answer 0 (score: 3)

I think you want to use an "or" condition here. How does this look:

require(dplyr)

#Create example dataframe
replicate = rep(c(1:3), times = 4)
treatment = rep(c("a","b"), each = 6)

df = data.frame(replicate, treatment)
df %>% 
  filter(replicate != 1 | treatment != "a")

   replicate treatment
1          2         a
2          3         a
3          2         a
4          3         a
5          1         b
6          2         b
7          3         b
8          1         b
9          2         b
10         3         b
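
Why "or" works: arguments separated by commas in filter() are combined with AND, just like & in subset(), so the original call kept only rows where replicate is not 1 and treatment is not "a". Dropping the rows where both conditions hold means negating the whole AND, and by De Morgan's law !(A & B) is the same as (!A | !B). Here is a minimal base R sketch of that equivalence on the same example data; the names to_drop, keep_negated, keep_or, keep_and and result are only illustrative and are not from the original post.

```r
# Rebuild the example data frame from the question
replicate <- rep(1:3, times = 4)
treatment <- rep(c("a", "b"), each = 6)
df <- data.frame(replicate, treatment)

# Rows to remove: replicate 1 AND treatment "a" (rows 1 and 4)
to_drop <- df$replicate == 1 & df$treatment == "a"

# De Morgan: !(A & B) is equivalent to (!A | !B)
keep_negated <- !to_drop
keep_or <- df$replicate != 1 | df$treatment != "a"

# What the question tried: (!A & !B), which is a stricter condition
keep_and <- df$replicate != 1 & df$treatment != "a"

identical(keep_negated, keep_or)  # TRUE
sum(keep_or)                      # 10 rows kept
sum(keep_and)                     # 4 rows kept

# Base R subset using the negated condition directly
result <- subset(df, !(replicate == 1 & treatment == "a"))
```

The negated form can also be passed straight to dplyr, for example df %>% filter(!(replicate == 1 & treatment == "a")), which stays pipe-friendly and reads closer to "remove the rows where both are true".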