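I want to use a loop to filter multiple columns of a data frame, removing every row in which the value of any given column appears in a particular list.
For example: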
> my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
> color_words = c("red", "orange", "yellow", "green", "blue")
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green
Using the dplyr filter() function:
> my_df %>% filter(!word1 %in% color_words) %>% filter(!word2 %in% color_words)
  word1 word2 word3
1   one apple   red
My first attempt at doing this filtering in a loop was:
col_names <- c("word1","word2")
for(col in col_names){
my_df <- my_df %>% filter(!col %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green
While working with filter() I learned about quoting and unquoting, so I also tried:
for(col in col_names){
col <- enquo(col)
my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green
and
for(col in col_names){
my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green
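What is the correct way to do this filtering in a loop?
Answer 0 (score: 2)
You don't need a loop; you can use filter together with across to apply a function to multiple columns: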
library(dplyr)
my_df %>% filter(across(all_of(col_names), ~!. %in% color_words))
#  word1 word2 word3
#1   one apple   red
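In dplyr 1.0.4 and later, if_all() is available for this kind of row-wise "all columns satisfy the condition" test, and newer releases deprecate using across() inside filter(). The following is a minimal sketch under that assumption; it rebuilds the question's data so it runs on its own, and the name result_if_all is only illustrative.

```r
library(dplyr)

# Data and column list copied from the question
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# Keep a row only if every column named in col_names holds a value
# that is NOT in color_words (requires dplyr >= 1.0.4 for if_all())
result_if_all <- my_df %>%
  filter(if_all(all_of(col_names), ~ !(.x %in% color_words)))

result_if_all
```

The parentheses around `.x %in% color_words` are optional, since `!` already binds more loosely than `%in%` in R, but they make the intent easier to read.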
If you are using an older version of dplyr, use filter_at:
my_df %>% filter_at(col_names, all_vars(!. %in% color_words))
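If a loop is still preferred, the attempts in the question most likely fail because col holds a character string, so `!col %in% color_words` tests the literal text "word1" rather than the column; that yields a single TRUE, which keeps every row. One way to look up a column by the name stored in a string is the `.data` pronoun. The sketch below assumes a dplyr version that supports `.data[[...]]` inside filter(); the name filtered_df is only illustrative.

```r
library(dplyr)

# Data and column list copied from the question
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

filtered_df <- my_df
for (col in col_names) {
  # .data[[col]] fetches the column whose name is the string stored in col,
  # so the test is applied to the column's values, not to the name itself
  filtered_df <- filtered_df %>%
    filter(!(.data[[col]] %in% color_words))
}

filtered_df
```

An alternative with the same effect is to turn the string into a symbol and unquote it, for example `filter(!(!!rlang::sym(col) %in% color_words))`.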
Answer 1 (score: 0)
Using base R:
my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
color_words <- paste0(c("red", "orange", "yellow", "green", "blue"), collapse = "|")
fltr <- apply(my_df[1:2], 1, function(x) !any(grepl(color_words, x)))
my_df[fltr, ]
#>   word1 word2 word3
#> 1   one apple   red
Created on 2020-09-25 by the reprex package (v0.3.0)
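One caveat with the grepl() approach: the pattern "red|orange|..." is matched as a regular expression anywhere in the string, so a value such as "redcurrant" would also be treated as a color word. If exact matching is wanted, a base R sketch along the following lines may be closer to the dplyr results; the names has_color and base_result are only illustrative.

```r
# Data and column list copied from the question
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# For each selected column, test exact membership in color_words,
# then combine the per-column results with element-wise OR
has_color <- Reduce(`|`, lapply(my_df[col_names], `%in%`, color_words))

# Keep only the rows where none of the selected columns matched
base_result <- my_df[!has_color, ]
base_result
```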