Passing a list of arguments to mutate_at

Time: 2019-10-09 22:51:05

Tags: r dplyr

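I'm sure there is a way to do this, but I can't figure it out. I'd like to be able to pass a list of arguments to mutate_at() inside a function, without having to specify each argument individually.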

library(tidyverse)

fake_data <-
  tibble(
    id = letters[1:6],
    ind_group_a = rep(0:1, times = 3),
    ind_group_b = rep(1:0, each = 3)
  )

#  id    ind_group_a ind_group_b
#   a              0           1
#   b              1           1
#   c              0           1
#   d              1           0
#   e              0           0
#   f              1           0

Then this function converts every 1 to "yes" and every 0 to "no":

recode_indicator <- function(x, if_1 = "yes", if_0 = "no") {
  ifelse(x == 1, if_1, if_0)
}
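
As a quick check of what recode_indicator() does on its own, here is a minimal sketch that calls it on a plain vector. The input vector and the variable names default_labels and custom_labels are made up for illustration.

```r
recode_indicator <- function(x, if_1 = "yes", if_0 = "no") {
  ifelse(x == 1, if_1, if_0)
}

# Illustrative input, not taken from fake_data
x <- c(0, 1, 1, 0)

# Default labels
default_labels <- recode_indicator(x)

# Custom labels passed as named arguments
custom_labels <- recode_indicator(x, if_1 = "Has", if_0 = "Missing")

print(default_labels)
print(custom_labels)
```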

I can use it just fine like this:

fake_data %>%
  mutate_at(
    vars(starts_with("ind_")),
    recode_indicator,
    if_1 = "Has",
    if_0 = "Missing"
  )

# id    ind_group_a ind_group_b
# <chr> <chr>       <chr>
#  a     Missing     Has        
#  b     Has         Has        
#  c     Missing     Has        
#  d     Has         Missing    
#  e     Missing     Missing    
#  f     Has         Missing 

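This is a simplified example, but what I want is to make this usable inside a function without having to write out all the arguments. Ideally it would be something short like binary_values = list(...), but I can't work out how to pass those items on as additional arguments to recode_indicator():
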
roll_up_indicators <- function(x,
                               #binary_values = list(if_1 = "yes", if_0 = "no"),
                               ...) {

  ind_cols <- grep("^ind_", names(x))

  df <-
    x %>%
    rename_at(ind_cols, str_remove, "^ind_") %>% 
    mutate_at(
      ind_cols,
      recode_indicator # ,
      # binary_values # <- here's the problem area
    ) %>%
    group_by_at(ind_cols) %>%
    count() %>%
    ungroup()

  knitr::kable(df, ...)
}


fake_data %>% roll_up_indicators()

#  |group_a |group_b |  n|
#  |:-------|:-------|--:|
#  |no      |no      |  1|
#  |no      |yes     |  2|
#  |yes     |no      |  2|
#  |yes     |yes     |  1|

Update

As for not having to rewrite all the arguments, the formals() function can be used:

roll_up_indicators <- function(x,
                               binary_values = formals(recode_indicator), # <--- formals
                               ...) {

  ind_cols <- grep("^ind_", names(x))

  df <-
    x %>%
    rename_at(ind_cols, str_remove, "^ind_") %>%
    mutate_at(
      ind_cols,
      partial(recode_indicator, !!!binary_values) # <--- the winning answer
    ) %>%
    group_by_at(ind_cols) %>%
    count() %>%
    ungroup()

  knitr::kable(df, ...)
}
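
Note that formals() returns every formal argument of the function, including x, which has no default value. The sketch below, in base R only, shows one way to keep just the arguments that carry defaults and hand them over as a list. do.call() is used here in place of partial() purely for illustration, and the names arg_names, defaults and recoded are hypothetical.

```r
recode_indicator <- function(x, if_1 = "yes", if_0 = "no") {
  ifelse(x == 1, if_1, if_0)
}

# All formal arguments, including `x` (which has no default)
arg_names <- names(formals(recode_indicator))

# Drop the first formal (`x`) so only the defaulted arguments remain
defaults <- as.list(formals(recode_indicator))[-1]

# Pass the list on as named arguments (illustrative input vector)
recoded <- do.call(recode_indicator, c(list(x = c(0, 1)), defaults))

print(arg_names)
print(recoded)
```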

2 answers:

Answer 0: (score: 1)

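It is best to use a ready-made function such as recode, but in case you want to add other functionality, I have also modified your function. For this I assume binary_values is named appropriately and will only ever contain two values.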

Option 1: use recode

This requires you to put the from and to values in a list. Note that numeric names have to be wrapped in quotes or backticks.

binary_values = list("1" = "yes", "0" = "no") 
fake_data %>% 
  mutate_at(vars(starts_with("ind_")),
            list(~recode(.,!!!binary_values)))
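
To see the splicing in isolation, here is a minimal sketch that applies recode() to a plain numeric vector rather than to data frame columns. It assumes dplyr is installed; the input vector and the name recoded are illustrative only.

```r
library(dplyr)

# Names are the values to match, given as quoted strings
binary_values <- list("1" = "yes", "0" = "no")

# Illustrative input, not taken from fake_data
x <- c(0, 1, 0)

# !!! splices the list so each element becomes a named argument of recode()
recoded <- recode(x, !!!binary_values)

print(recoded)
```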

Option 2: pick the values out of the list by position or by name inside the function

recode_value <- function(x, 
                         binary_values = list(if_1 = "yes", if_0 = "no")
                         ## You'll need to decide whether you'll name them as expected or always put them in this order; it's up to you
                         ) {
  if_1 <- binary_values$if_1 # or binary_values[[1]]
  if_0 <- binary_values$if_0 # or binary_values[[2]]
  ifelse(x == 1, if_1, if_0)
}

binary_values = list(if_1 = "yes", if_0 = "no")
fake_data %>%
  mutate_at(
    vars(starts_with("ind_")),
    recode_value, ## fixed typo
    binary_values
  )

Answer 1: (score: 1)

One solution is to use purrr::partial to specify that the if_1 and if_0 arguments should come from binary_values:

roll_up_indicators <- function(x,
                               binary_values = list(if_1 = "yes", if_0 = "no"),
                               ...) {

  ind_cols <- grep("^ind_", names(x))

  df <-
    x %>%
    rename_at(ind_cols, str_remove, "^ind_") %>%
    mutate_at(
      ind_cols,
      partial(recode_indicator, !!!binary_values)    ## <--- partial() here
    ) %>%
    group_by_at(ind_cols) %>%
    count() %>%
    ungroup()

  knitr::kable(df, ...)
}

fake_data %>% roll_up_indicators()
#  |group_a |group_b |  n|
#  |:-------|:-------|--:|
#  |no      |no      |  1|
#  |no      |yes     |  2|
#  |yes     |no      |  2|
#  |yes     |yes     |  1|
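
To make the effect of partial() easier to see, here is a minimal sketch that applies it to a plain vector outside of mutate_at(). It assumes purrr 0.3.0 or later is installed; the name recode_has and the input vector are hypothetical.

```r
library(purrr)

recode_indicator <- function(x, if_1 = "yes", if_0 = "no") {
  ifelse(x == 1, if_1, if_0)
}

binary_values <- list(if_1 = "Has", if_0 = "Missing")

# partial() pre-fills if_1 and if_0; !!! splices the list into named arguments
recode_has <- partial(recode_indicator, !!!binary_values)

# The remaining argument, x, is supplied at call time
result <- recode_has(c(0, 1, 1))

print(result)
```

The returned function takes only the column, which is the shape mutate_at() expects for its function argument.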