Question

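This is similar to this question and this one, but I can't seem to figure out how to adapt them to my situation. I have a 1437 x 60 data frame in which every value is numeric. The first column is "Depth", and based on an investigation of other data I need to remove the depths (rows) that I consider outliers.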

For example:

Test <- data.frame(Depth = seq(from = 0, to = 100, by = 0.5), X1 = runif(n = 201, min = 1, max = 10), X2 = runif(n = 201, min = 1, max = 10))

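I want to remove the rows where Depth is between 46.5 and 48.5, and the rows where Depth is between 65.5 and 68.5. I tried creating a vector and filtering on it, e.g.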

OutDepths <- c(seq(from = 46.5, to = 48.5, by = 0.5), seq(from = 65.5, to = 68.5, by = 0.5))

Test1 <- Test %>% filter(Depth == !OutDepths)

which gives the error

longer object length is not a multiple of shorter object length

I get the same error if I try

Test1 <- Test[Test$Depth == !OutDepths, ]

Thanks in advance for any suggestions.

Solution

It turns out I simply had the not (!) operator in the wrong place, and I should have been using %in% rather than == all along.

For example

Test1 <- Test %>%
filter(!Depth %in% OutDepths)

or in base R

Test1 <- Test[!Test$Depth %in% OutDepths, ]
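To see why the original attempt failed, it may help to look at what `!` and `==` each do to a vector. The following is a minimal sketch in base R; the small `depth` and `out` vectors are made up for illustration and are not from the question.

```r
# Hypothetical small vectors, for illustration only
depth <- c(46.0, 46.5, 47.0, 49.0, 50.0)
out   <- c(46.5, 47.0)

# `!` coerces numbers to logical first: every non-zero value becomes FALSE
not_out <- !out

# `==` compares element by element and recycles the shorter vector,
# which is what triggers the "longer object length" message
cmp <- suppressWarnings(depth == not_out)

# `%in%` tests membership and returns one value per element of `depth`,
# so negating the whole result keeps the rows that are NOT listed
keep <- !(depth %in% out)
kept <- depth[keep]
```

Note that `%in%` relies on exact equality. That is safe here because multiples of 0.5 are represented exactly in floating point; with a step such as 0.1, a range-based filter is likely the more robust choice.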

Answer 1

Here is another option, using the between function.

library(dplyr)
df <- data.frame(depth = c(20,40,47,50,60,67,80,90,100,120))

df %>% 
    filter(!between(depth, 46.5, 48.5)) %>% 
    filter(!between(depth, 65.5, 68.5))



#   depth
#1    20
#2    40
#3    50
#4    60
#5    80
#6    90
#7   100
#8   120
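If there are more than two ranges to drop, chaining one filter per range gets repetitive. One possible generalisation is sketched below in base R. The helper `in_any_range` and the `lower`/`upper` vectors are hypothetical names introduced here, not part of dplyr or of the answer above; the comparison is inclusive on both ends, in the same way as `between()`.

```r
# Hypothetical helper: TRUE where x falls inside any [lower, upper] interval
in_any_range <- function(x, lower, upper) {
  stopifnot(length(lower) == length(upper))
  hit <- rep(FALSE, length(x))
  for (i in seq_along(lower)) {
    # Inclusive on both ends, like dplyr::between()
    hit <- hit | (x >= lower[i] & x <= upper[i])
  }
  hit
}

df <- data.frame(depth = c(20, 40, 47, 50, 60, 67, 80, 90, 100, 120))

# Bounds of the depth ranges to drop (values taken from the question)
lower <- c(46.5, 65.5)
upper <- c(48.5, 68.5)

# drop = FALSE keeps the result a data frame even with a single column
kept <- df[!in_any_range(df$depth, lower, upper), , drop = FALSE]
```

Adding a third range then only means appending one value to each of `lower` and `upper`.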

Answer 2

Try this:

Test %>% 
  filter(Depth < 46.5 | Depth > 48.5) %>%
  filter(Depth < 65.5 | Depth > 68.5)
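As a rough check, the comparison-based filter can be written as a single base R condition and compared against the `%in%` approach on the question's example data. This is only a sketch: `set.seed()` is added purely to make the random columns reproducible, and `keep`, `by_range`, `by_member`, `same` and `removed` are illustrative names.

```r
set.seed(1)  # only so the random columns are reproducible
Test <- data.frame(Depth = seq(from = 0, to = 100, by = 0.5),
                   X1 = runif(n = 201, min = 1, max = 10),
                   X2 = runif(n = 201, min = 1, max = 10))

# Same logic as the two chained filter() calls, as one base R condition
keep <- (Test$Depth < 46.5 | Test$Depth > 48.5) &
        (Test$Depth < 65.5 | Test$Depth > 68.5)
by_range <- Test[keep, ]

# Membership-based version from the question's solution
OutDepths <- c(seq(from = 46.5, to = 48.5, by = 0.5),
               seq(from = 65.5, to = 68.5, by = 0.5))
by_member <- Test[!Test$Depth %in% OutDepths, ]

# Both approaches should select exactly the same rows on this 0.5 grid
same    <- identical(by_range, by_member)
removed <- nrow(Test) - nrow(by_range)
```

The two approaches agree here because every depth lies on the 0.5 grid. The range-based version would also drop depths that fall inside a range but off the grid, which the `%in%` version would keep.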

Filtering based on values in a column
