Filtering based on values in a column

Time: 2020-06-28 15:47:18

Tags: r filter dplyr subset

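This is similar to this question and this one, but I can't seem to work out how to adapt them to my case. I have a 1437 x 60 data frame containing only numeric values. The first column is "Depth", and based on an examination of other data I need to remove the depths (rows) that I consider outliers.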

For example:

library(dplyr)  # provides %>% and filter() used below
Test <- data.frame(Depth = seq(from = 0, to = 100, by = 0.5), X1 = runif(n = 201, min = 1, max = 10), X2 = runif(n = 201, min = 1, max = 10))

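I want to remove the rows with depths between 46.5 and 48.5, and the rows with depths between 65.5 and 68.5. I have tried creating a vector and filtering on it, e.g.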

OutDepths <- c(seq(from = 46.5, to = 48.5, by = 0.5), seq(from = 65.5, to = 68.5, by = 0.5))

Test1 <- Test %>% filter(Depth == !OutDepths)

which gives the error

longer object length is not a multiple of shorter object length

I get the same error if I try

Test1 <- Test[Test$Depth == !OutDepths, ]

Thanks in advance for any suggestions.

Solution: It turns out I simply had the not (!) operator in the wrong place, and I should have been using %in% instead of == all along.

For example

Test1 <- Test %>%
filter(!Depth %in% OutDepths)

or in base R

Test1 <- Test[!Test$Depth %in% OutDepths, ]
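To see why the original attempt misbehaved, it may help to look at what each piece evaluates to. The following is a minimal sketch that rebuilds the vectors from the question under the illustrative names depth and out_depths (these names are not from the original post). It assumes the depths are exact multiples of 0.5, as in the example data, so that exact matching with %in% is safe.

```r
# Reconstruction of the vectors from the question (names are illustrative).
depth <- seq(from = 0, to = 100, by = 0.5)
out_depths <- c(seq(from = 46.5, to = 48.5, by = 0.5),
                seq(from = 65.5, to = 68.5, by = 0.5))

# `!` first coerces the numeric vector to logical. Every non-zero number
# counts as TRUE, so the negation is a vector of FALSE values,
# not "all the other depths".
negated <- !out_depths

# `==` compares position by position and recycles the shorter vector.
# 201 is not a multiple of 12, hence the "longer object length" message.
lengths_compared <- c(length(depth), length(out_depths))

# `%in%` asks, for each depth, whether it occurs anywhere in out_depths.
# `!x %in% y` parses as `!(x %in% y)`; the parentheses here are for readability.
keep <- !(depth %in% out_depths)
kept_depths <- depth[keep]
```

Here sum(!keep) counts the rows that would be dropped, and kept_depths holds the remaining depths.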

2 Answers:

Answer 0: (score: 4)

Here is another option, using the between function.

library(dplyr)
df <- data.frame(depth = c(20,40,47,50,60,67,80,90,100,120))

df %>% 
    filter(!between(depth, 46.5, 48.5)) %>% 
    filter(!between(depth, 65.5, 68.5))



#   depth
#1    20
#2    40
#3    50
#4    60
#5    80
#6    90
#7   100
#8   120

Answer 1: (score: 3)

Try this:

Test %>% 
  filter(Depth < 46.5 | Depth > 48.5) %>%
  filter(Depth < 65.5 | Depth > 68.5)
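The two answers should select the same rows, since between() is inclusive at both ends and its negation is therefore the same as "strictly below the lower bound or strictly above the upper bound". As a rough check, the sketch below reuses the small df from the first answer and puts both conditions in a single filter() call, where comma-separated conditions are combined with AND. The names via_between and via_compare are illustrative only.

```r
library(dplyr)

# Small example data, as in the first answer.
df <- data.frame(depth = c(20, 40, 47, 50, 60, 67, 80, 90, 100, 120))

# Approach 1: negate between(), which tests lower <= x <= upper.
via_between <- df %>%
  filter(!between(depth, 46.5, 48.5),
         !between(depth, 65.5, 68.5))

# Approach 2: explicit comparisons, keeping values strictly outside each range.
via_compare <- df %>%
  filter(depth < 46.5 | depth > 48.5,
         depth < 65.5 | depth > 68.5)

# Both approaches should keep the same depths.
same_rows <- identical(via_between$depth, via_compare$depth)
```

Which form to use is mostly a matter of taste: between() states the range once, while the explicit comparisons make it easy to switch a bound from inclusive to exclusive.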