Pass a data.frame column name to a function that uses purrr::map

Time: 2018-06-04 16:44:18

Tags: r dataframe purrr

I'm working with nested dataframes and want to pass the name of the top level dataframe, and the name of a column containing lower level dataframes, to a function that uses purrr::map to iterate over the lower level data frames.

Here's a toy example.

library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

df1 <- tibble(x = c("a","b","c", "a","b","c"), y = 1:6)
df1 <- df1 %>%
  group_by(x) %>%
  nest()

testfunc1 <- function(df) {
  df <- df %>%
    mutate(out = map(data, min))
  tibble(min1 = df$out)
}

testfunc2 <- function(df, col_name) {
  df <- df %>%
    mutate(out = map(col_name, min))
  tibble(min2 = df$out)
}

df1 <- bind_cols(df1, testfunc1(df1))
df1 <- bind_cols(df1, testfunc2(df1, "data"))

df1$min1
df1$min2

testfunc1 behaves as expected: it adds a new column holding the minimum of each nested data frame in the data column. In testfunc2, where I've tried to pass the column name as a string, the new column just contains the string "data" instead. I think I understand from the thread here (Pass a data.frame column name to a function) why this doesn't behave as I want, but I haven't been able to figure out how to make it work in this case. Any suggestions would be great.

1 answer:

Answer 0 (score: 4)

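This should work for you. It uses the tidy eval framework and assumes col_name is a string: rlang::sym() turns the string into a symbol, and !! unquotes that symbol so that mutate() looks up the column of that name instead of mapping over the string itself.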

testfunc2 <- function(df, col_name) {
  df <- df %>%
    mutate(out = map(!!rlang::sym(col_name), min))
  tibble(min2 = df$out)
}
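
As a quick check, here is a minimal sketch that runs the string version against the toy data from the question. It assumes dplyr, purrr, tibble, tidyr and rlang are installed; the name res is only used for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# Same toy data as in the question: one nested tibble per value of x,
# stored in the list column `data`.
df1 <- tibble(x = c("a", "b", "c", "a", "b", "c"), y = 1:6) %>%
  group_by(x) %>%
  nest()

# String version: sym() converts "data" to a symbol, !! unquotes it.
testfunc2 <- function(df, col_name) {
  df <- df %>%
    mutate(out = map(!!rlang::sym(col_name), min))
  tibble(min2 = df$out)
}

# `res` is an illustrative name, not part of the original question.
res <- testfunc2(df1, "data")

# One minimum per group; for groups a, b, c these are 1, 2, 3.
unlist(res$min2)
```

Because col_name is an ordinary string here, it can also come from a variable, e.g. nm <- "data"; testfunc2(df1, nm).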

EDIT:

If you'd rather pass a bare column name to the function, instead of a string, use enquo instead of sym.

testfunc2 <- function(df, col_name) {
  col_quo <- enquo(col_name)
  df <- df %>%
    mutate(out = map(!!col_quo, min))
  tibble(min2 = df$out)
}
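
For completeness, a small sketch of calling the bare-name version on the question's toy data. The second function, testfunc3, is a hypothetical variant that is not part of the original answer: it uses the {{ }} ("embrace") shorthand, which assumes rlang 0.4.0 or later and a dplyr version that supports it, and should behave the same as enquo() followed by !!.

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# Toy data from the question, nested by x.
df1 <- tibble(x = c("a", "b", "c", "a", "b", "c"), y = 1:6) %>%
  group_by(x) %>%
  nest()

# Bare-name version from the answer: enquo() captures the argument
# expression, !! injects it into the mutate() call.
testfunc2 <- function(df, col_name) {
  col_quo <- enquo(col_name)
  df <- df %>%
    mutate(out = map(!!col_quo, min))
  tibble(min2 = df$out)
}

# Hypothetical variant (assumption: rlang >= 0.4.0): {{ col_name }}
# combines the enquo() and !! steps in one piece of syntax.
testfunc3 <- function(df, col_name) {
  df <- df %>%
    mutate(out = map({{ col_name }}, min))
  tibble(min3 = df$out)
}

# Note the unquoted column name in both calls.
res_bare    <- testfunc2(df1, data)
res_embrace <- testfunc3(df1, data)

unlist(res_bare$min2)
unlist(res_embrace$min3)
```

Both calls take the column as an unquoted name, so testfunc2(df1, "data") would no longer do the right thing with these definitions; pick the string or the bare-name interface and stick with it.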