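How can I programmatically filter a data frame using dplyr and tidy evaluation?

Asked: 2017-07-16 23:50:20

Tags: r dplyr tidyverse rlang

Suppose I want to filter the starwars data frame programmatically. Here is a simple example that lets me filter by homeworld and species: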

library(tidyverse)

# a function that allows the user to supply filters
filter_starwars <- function(filters) {
  for (filter in filters) {
    starwars = filter_at(starwars, filter$var, all_vars(. %in% filter$values))
  }

  return(starwars)
}

# filter Star Wars characters that are human, and from either Tatooine or Alderaan
filter_starwars(filters = list(
  list(var = "homeworld", values = c("Tatooine", "Alderaan")),
  list(var = "species", values = "Human")
))

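But this does not let me specify, say, a height filter, because I have hard-coded the %in% operator in the .vars_predicate of filter_at(), and a height filter would use one of the >, >=, <, <=, or == operators.

What is the best way to write the filter_starwars() function so that the user can supply filters general enough to filter on any column and use any operator?

NB: using the now-deprecated filter_() function, I could pass a string: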

filter_(starwars, "species == 'Human' & homeworld %in% c('Tatooine', 'Alderaan') & height > 175")

But again, this has been deprecated.

2 Answers:

Answer 0 (score: 13)

Try

filter_starwars <- function(...) {
  F <- quos(...)
  filter(starwars, !!!F)
}

filter_starwars(species == 'Human', homeworld %in% c('Tatooine', 'Alderaan'), height > 175)
# # A tibble: 7 × 13
#                  name height  mass  hair_color skin_color eye_color birth_year
#                 <chr>  <int> <dbl>       <chr>      <chr>     <chr>      <dbl>
# 1         Darth Vader    202   136        none      white    yellow       41.9
# 2           Owen Lars    178   120 brown, grey      light      blue       52.0
# 3   Biggs Darklighter    183    84       black      light     brown       24.0
# 4    Anakin Skywalker    188    84       blond       fair      blue       41.9
# 5         Cliegg Lars    183    NA       brown       fair      blue       82.0
# 6 Bail Prestor Organa    191    NA       black        tan     brown       67.0
# 7     Raymus Antilles    188    79       brown      light     brown         NA
# # ... with 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
# #   films <list>, vehicles <list>, starships <list>

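See https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html for details. In brief, quos() captures ... as a list of quosures without evaluating the arguments, and !!! splices and unquotes those arguments so that they are evaluated inside filter().

Answer 1 (score: 8)

Here are a few approaches.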
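As a minimal sketch of what the two steps do, the snippet below captures two conditions with quos() outside of any function and then splices them into filter(). The variable names `conds`, `spliced`, and `direct` are illustrative and do not come from the answer.

```r
library(dplyr)
library(rlang)

# quos() captures each expression unevaluated; nothing is computed here,
# so height and species do not need to exist in the calling environment.
conds <- quos(height > 175, species == "Human")

# conds is a list with one quosure per captured expression.
length(conds)

# !!! splices the list so that each element becomes a separate
# argument to filter(), as if it had been typed in directly.
spliced <- filter(starwars, !!!conds)

# The same call written out by hand, for comparison.
direct <- filter(starwars, height > 175, species == "Human")
```

Both calls should return the same rows, since splicing only changes how the conditions are supplied, not how filter() evaluates them.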

1) For this particular problem we don't actually need rlang or anything similar. This works:

filter_starwars <- function(...) {
  filter(starwars, ...)
}

# test
filter_starwars(species == 'Human', homeworld %in% c('Tatooine', 'Alderaan'), height > 175)

2) If having character arguments is important, then:

library(rlang)

filter_starwars <- function(...) {
    filter(starwars, !!!parse_exprs(paste(..., sep = ";")))
}

# test
filter_starwars("species == 'Human'", 
                "homeworld %in% c('Tatooine', 'Alderaan')", 
                "height > 175")

2a) Or, if a single character vector is to be passed:

library(rlang)

filter_starwars <- function(filters) {
    filter(starwars, !!!parse_exprs(paste(filters, collapse = ";")))
}

# test 
filter_starwars(c("species == 'Human'", 
                  "homeworld %in% c('Tatooine', 'Alderaan')", 
                  "height > 175"))
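If the structured list-of-lists input from the question is preferred over strings, one possible sketch (not taken from either answer) is to add an operator field to each filter and build the calls with rlang::call2(). The function name `filter_starwars_spec` and the `op` field are assumptions introduced here for illustration.

```r
library(dplyr)
library(rlang)

# Hypothetical variant: each filter is list(var = , op = , values = ),
# where op is the name of a binary operator given as a string.
filter_starwars_spec <- function(filters) {
  conds <- lapply(filters, function(f) {
    # Builds the unevaluated call `<var> <op> <values>`,
    # e.g. height > 175, with the values inlined as constants.
    call2(f$op, sym(f$var), f$values)
  })
  # Splice the list of calls into filter() as separate conditions.
  filter(starwars, !!!conds)
}

res_spec <- filter_starwars_spec(list(
  list(var = "homeworld", op = "%in%", values = c("Tatooine", "Alderaan")),
  list(var = "species",   op = "==",   values = "Human"),
  list(var = "height",    op = ">",    values = 175)
))
```

Compared with parsing strings, this keeps column names, operators, and values as separate pieces of data, so user input is never parsed as arbitrary R code.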