Filtering multiple columns of a data frame inside a loop in R

Time: 2020-09-25 14:18:49

Tags: r dataframe dplyr

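I want to filter multiple columns of a data frame in a loop, dropping every row in which the value of any of the given columns appears in a specific list.

For example: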

> my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
> color_words = c("red", "orange", "yellow", "green", "blue")
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

Using dplyr's filter() function, this works:

> my_df %>% filter(!word1 %in% color_words) %>% filter(!word2 %in% color_words)
  word1 word2 word3
1   one apple   red

My first attempt at doing this filtering in a loop was the following, which leaves the data frame unchanged:

col_names <- c("word1","word2")
for(col in col_names){
    my_df <- my_df %>% filter(!col %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

While working with filter() I read about quoting and unquoting, so I also tried the following, with the same result:

for(col in col_names){
    col <- enquo(col)
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

for(col in col_names){
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

What is the correct way to do this filtering in a loop?

2 answers:

Answer 0 (score: 2)

You don't need a loop. You can use filter() together with across() to apply a function to multiple columns:

library(dplyr)
my_df %>% filter(across(all_of(col_names), ~!. %in% color_words))

#  word1 word2 word3
#1   one apple   red
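
As a side note for current dplyr versions: if_all() was added in dplyr 1.0.4 for exactly this kind of row test, and later releases deprecate using across() inside filter(). The following is a minimal sketch of the same filter written with if_all(), assuming the my_df, color_words and col_names objects from the question (they are rebuilt here so the snippet runs on its own; result_if_all is just an illustrative name):

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# Keep a row only if every selected column holds a value outside color_words
result_if_all <- my_df %>%
  filter(if_all(all_of(col_names), ~ !(.x %in% color_words)))

result_if_all
```

if_any() is the counterpart for "at least one column matches", so negating an if_any() test should give the same rows here.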

If you are using an older version of dplyr (before 1.0.0, which introduced across()), use filter_at():

my_df %>% filter_at(col_names, all_vars(!. %in% color_words))
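
If a loop is still preferred, it can be made to work. In the question's first loop, col holds a string such as "word1", so `!col %in% color_words` tests the string itself rather than the column; that yields a single TRUE, which keeps every row. The unquoting attempts appear to run into the same problem, since unquoting a string still gives a string. A minimal sketch of one way around this uses the .data pronoun to look the column up by name (the variable name filtered is only for illustration):

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

filtered <- my_df
for (col in col_names) {
  # .data[[col]] refers to the column whose name is stored in the string col
  filtered <- filtered %>% filter(!(.data[[col]] %in% color_words))
}

filtered
```

Each pass removes the rows whose value in the current column is a color word, so after the loop only rows that pass the test for every listed column remain.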

Answer 1 (score: 0)

Using base R:

my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
color_words <-  paste0(c("red", "orange", "yellow", "green", "blue"), collapse = "|") 
fltr <- apply(my_df[1:2], 1, function(x) !any(grepl(color_words, x)))
my_df[fltr, ]
#>   word1 word2 word3
#> 1   one apple   red

Created on 2020-09-25 by the reprex package (v0.3.0)
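
One caveat with the grepl() approach: it matches a regular expression anywhere in the string, so a value such as "bored" would also be dropped because it contains "red". If exact matches are wanted, as in the question's %in% filter, a base R sketch along these lines may be closer (col_names, is_color, keep and base_result are illustrative names, not part of the answer above):

```r
# Data from the question, rebuilt so the example is self-contained
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# One logical vector per selected column: is the value exactly a color word?
is_color <- lapply(my_df[col_names], function(x) x %in% color_words)

# Combine the per-column vectors with OR, then keep rows with no match at all
keep <- !Reduce("|", is_color)
base_result <- my_df[keep, ]

base_result
```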
