How do I pass column names as function arguments with dplyr and ggplot2?

Asked: 2017-07-26 17:07:53

Tags: r ggplot2 dplyr

I am trying to write a function that produces model diagnostic plots.

to_plot <- function(df, model, response_variable, indep_variable) {
  resp_plot <- 
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(indep_variable) %>%
    summarize(actual_response = mean(response_variable),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(indep_variable)) + 
    geom_line(aes(x = indep_variable, y = actual_response, colour = "actual")) + 
    geom_line(aes(x = indep_variable, y = predicted_response, colour = "predicted")) +
    ylab(label = 'Response')

}

When I run it on a dataset, dplyr throws an error that I don't understand:

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)

 Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `indep_variable` is unknown 

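From some rough debugging, I found that the error occurs in the group_by step, so it is probably related to how I refer to the columns inside the function. Thanks!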

1 Answer:

Answer 0: (score: 1)

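This code seems to solve the problem. As a commenter mentioned above, the variables passed into the function have to be captured with the "enquo" function and then unquoted with !!. Note that aes() is replaced by aes_(), which accepts quoted expressions (the captured variables and quote()d column names) instead of bare column names.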

library(tidyverse)

to_plot <- function(df, model, response_variable, indep_variable) {
  # Capture the bare column names supplied by the caller as quosures
  response_variable <- enquo(response_variable)
  indep_variable <- enquo(indep_variable)

  resp_plot <- 
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    # !! unquotes the captured column so dplyr looks it up in df
    group_by(!!indep_variable) %>%
    summarize(actual_response = mean(!!response_variable),
              predicted_response = mean(model_resp)) %>%
    # aes_() takes the quosure as-is; quote() names the summary columns
    ggplot(aes_(indep_variable)) + 
    geom_line(aes_(x = indep_variable, y = quote(actual_response)), colour = "blue") + 
    geom_line(aes_(x = indep_variable, y = quote(predicted_response)), colour = "red") +
    ylab(label = 'Response')

  return(resp_plot)
}

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)
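
For readers on newer package versions, here is a sketch of the same idea, not part of the original answer. It assumes rlang >= 0.4.0 and ggplot2 >= 3.0.0, where the embrace operator {{ }} combines the enquo() and !! steps and aes() itself understands tidy evaluation, so aes_() is not needed. The function name to_plot2 is made up for this illustration, and the colour mapping follows the legend-style labels from the question rather than the fixed blue/red colours above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of to_plot() using the embrace operator {{ }}.
# Assumes rlang >= 0.4.0 and ggplot2 >= 3.0.0.
to_plot2 <- function(df, model, response_variable, indep_variable) {
  df %>%
    # Add the model's predictions on the response scale
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    # {{ }} forwards the caller's bare column name into dplyr
    group_by({{ indep_variable }}) %>%
    summarize(actual_response = mean({{ response_variable }}),
              predicted_response = mean(model_resp)) %>%
    # aes() accepts {{ }} as well, so the x mapping is set once here
    ggplot(aes(x = {{ indep_variable }})) +
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    ylab("Response")
}

fit <- glm(mpg ~ wt + qsec + am, data = mtcars,
           family = gaussian(link = "identity"))
p <- to_plot2(mtcars, fit, mpg, wt)
```

Calling print(p) draws the plot. The call site is unchanged from the answer above: columns are still passed as bare names such as mpg and wt.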