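Is there a way to print the number of rows removed by each filter condition when using dplyr's filter function on a data frame?
Consider a simple example data frame that gets filtered: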
test.df <- data.frame(col1 = c(1,2,3,4,4,5,5,5))
filtered.df <- test.df %>% filter(col1 != 4, col1 != 5)
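I would like this code to also print how many rows each filter condition removed.
Here is what I have tried so far, writing my own function: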
print_filtered_rows <- function(dataframe, ...) {
  dataframe_new <- dataframe
  for (arg in list(...)) {
    print(arg)
    dataframe <- dataframe_new
    dataframe_new <- dataframe %>% filter(arg)
    rows_filtered <- nrow(dataframe) - nrow(dataframe_new)
    print(sprintf('Filtered out %s rows using: %s', rows_filtered, arg))
  }
  return(dataframe_new)
}
But I cannot work out what the ... argument actually is or how to use it. I have read:
http://adv-r.had.co.nz/Functions.html#function-arguments
but that did not help me.
Answer 0 (score: 3)
Very close! What you are actually looking for is the chapter on Non-Standard Evaluation.
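To make the non-standard evaluation idea concrete, here is a minimal base R sketch of how the conditions passed through ... can be captured without being evaluated. The helper name capture_conditions is hypothetical and used only for illustration; it is not part of dplyr.

```r
# capture_conditions() is a hypothetical helper, not part of dplyr.
capture_conditions <- function(...) {
  # substitute() returns the call list(<expr1>, <expr2>, ...) without
  # evaluating it; dropping the first element removes the `list` symbol.
  as.list(substitute(list(...)))[-1L]
}

conds <- capture_conditions(col1 != 4, col1 != 5)

# Each element is an unevaluated call, so col1 does not need to exist yet.
class(conds[[1]])    # "call"
deparse(conds[[1]])  # "col1 != 4"

# The call can be evaluated later, using a data frame as the environment.
test.df <- data.frame(col1 = c(1, 2, 3, 4, 4, 5, 5, 5))
keep <- eval(conds[[1]], test.df)
sum(!keep)           # 2 rows fail the first condition
```

By contrast, list(...) evaluates every argument immediately, which is why a condition such as col1 != 4 fails there: col1 is a column, not a variable visible to the function.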
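library(dplyr)
print_filtered_rows <- function(dataframe, ...) {
  df <- dataframe
  # Capture the conditions in ... as unevaluated expressions
  vars <- as.list(substitute(list(...)))[-1L]
  for (arg in vars) {
    # !! injects the captured expression so filter() evaluates it against df;
    # current dplyr releases reject a bare call object passed as filter(arg)
    dataframe_new <- df %>% filter(!!arg)
    rows_filtered <- nrow(df) - nrow(dataframe_new)
    cat(sprintf('Filtered out %s rows using: %s\n', rows_filtered, deparse(arg)))
    df <- dataframe_new
  }
  # Returning df also covers the case where no conditions were supplied
  return(df)
}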
data(iris)
iris %>%
print_filtered_rows(Species == "virginica", Species != "virginica") %>%
head()
#> Filtered out 100 rows using: Species == "virginica"
#> Filtered out 50 rows using: Species != "virginica"
#> [1] Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <0 rows> (or 0-length row.names)
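For the data frame from the question, the same idea can also be sketched with the tidy evaluation helpers from rlang (installed as a dplyr dependency). This is only one possible variant, assuming a reasonably recent dplyr and rlang; the name print_filtered_rows2 is hypothetical.

```r
library(dplyr)

# print_filtered_rows2() is a hypothetical name for this variant.
print_filtered_rows2 <- function(dataframe, ...) {
  # enquos() captures each condition together with its environment
  conditions <- rlang::enquos(...)
  for (cond in conditions) {
    # !! injects the captured condition into filter()
    filtered <- filter(dataframe, !!cond)
    cat(sprintf("Filtered out %d rows using: %s\n",
                nrow(dataframe) - nrow(filtered),
                rlang::as_label(cond)))
    dataframe <- filtered
  }
  dataframe
}

test.df <- data.frame(col1 = c(1, 2, 3, 4, 4, 5, 5, 5))
filtered.df <- test.df %>% print_filtered_rows2(col1 != 4, col1 != 5)
#> Filtered out 2 rows using: col1 != 4
#> Filtered out 3 rows using: col1 != 5
```

Because each condition is applied to the result of the previous one, the counts are cumulative: the second message reports rows removed from what the first condition left behind.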