Mysterious error when passing a column name to a function in R

Time: 2017-05-14 21:29:31

Tags: r dplyr tidyr spread

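I am writing a function that takes a data frame and spreads a factor variable into new dummy variables, because some machine learning algorithms cannot handle factors. To do this, I call spread() inside my cleaning function.

When I try to pass in the name of the column I need to spread, it throws an error: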

Error: Invalid column specification

Here is the code:

library(tidyr)
library(dplyr)    
library(C50) # this is one source for the churn data
data(churn)


f <- function(df, name)  {
  df$dummy <- c(1:nrow(df))       # create dummy variable with unique values

  df <- spread(df, key <- as.character(substitute(name)), "dummy", fill = 0 )
}

churnTrain = f(churnTrain, name = "state")
str(churnTrain)

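Of course, if I replace key = as.character(substitute(name)) with key = "state", it works fine, but then the function is no longer reusable.

How can I pass a column name to the inner function without getting an error?

2 Answers:

Answer 0: (score: 0)

Do you need to use the tidyverse?

If not, you could try the older reshape2 package: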


library(reshape2)
library(C50) # this is one source for the churn data
data(churn)

f <- function(df1, name)  {
  df1$dummy <- 1:nrow(df1)  # create dummy variable with unique values
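  # build the formula from the column name; naming value.var avoids dcast() having to guess it
  df1 <- dcast(df1, as.formula(paste0("dummy ~ ", name)), value.var = "dummy")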
}

ct1 <- f(churnTrain, name = "state")

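If you absolutely need to stay within the tidyverse, you can try following the tutorial at http://dplyr.tidyverse.org/articles/programming.html. Unfortunately, their examples do not work on my machine.

Answer 1: (score: 0)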
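
For readers who want to stay in the tidyverse, here is a minimal sketch of the tidy evaluation approach that tutorial describes. It assumes tidyr 0.7.0 or later together with rlang. The helper name spread_by_name and the toy data frame are hypothetical and not from the original post. The idea is that spread() captures its key argument unevaluated, so an expression such as as.character(substitute(name)) is treated as the column specification itself; converting the string to a symbol with sym() and unquoting it with !! hands spread() the column it expects.

```r
library(tidyr)
library(rlang)

# Hypothetical helper (not from the original post); assumes tidyr >= 0.7.0.
spread_by_name <- function(df, name) {
  df$dummy <- seq_len(nrow(df))  # unique value per row, as in the question
  # sym() turns the string into a symbol; !! unquotes it so that
  # spread() sees the column itself rather than the variable `name`
  spread(df, key = !!sym(name), value = dummy, fill = 0)
}

# Toy data standing in for churnTrain
toy <- data.frame(state = c("KS", "OH", "KS"), x = c(1, 2, 3))
wide <- spread_by_name(toy, "state")
str(wide)
```

The column to spread is now an ordinary string argument, so the same helper can be reused for any factor column.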

library(tidyr)
library(dplyr)    
library(C50) # this is one source for the churn data
data(churn)


f <- function(df, name)  {
  df$dummy <- c(1:nrow(df))       # create dummy variable with unique values

  # spread_() is the standard-evaluation variant: it takes column names as strings
  df <- spread_(df, key_col = name, value_col = "dummy", fill = 0)
}

churnTrain = f(churnTrain, name = "state")
str(churnTrain)
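
Note that spread_() has since been deprecated, and spread() itself is superseded by pivot_wider() in tidyr 1.0.0 and later. As a hedged sketch of the same idea on a newer tidyr, the following uses pivot_wider() with tidyselect::all_of() to select the column by name. The helper name widen_by_name and the toy data are hypothetical, and the code assumes tidyr 1.0.0 or later is installed.

```r
library(tidyr)

# Hypothetical helper (not from the original answer); assumes tidyr >= 1.0.0.
widen_by_name <- function(df, name) {
  df$dummy <- seq_len(nrow(df))  # unique value per row, as in the question
  pivot_wider(
    df,
    names_from  = tidyselect::all_of(name),  # select the column by its string name
    values_from = "dummy",
    values_fill = list(dummy = 0)            # fill missing combinations with 0
  )
}

# Toy data standing in for churnTrain
toy <- data.frame(state = c("KS", "OH", "KS"), x = c(1, 2, 3))
wide2 <- widen_by_name(toy, "state")
str(wide2)
```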