User-defined function with lapply

Time: 2014-12-19 00:01:38

Tags: r function lapply

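I am trying to build a user-defined function that takes predetermined variables (independent and dependent) from the active data frame as input. Take the example data frame df below, which records a coin toss outcome alongside other recorded variables: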

> df
  outcome toss    person  hand age
1       H    1      Mary  Left  18
2       T    2     Allen  Left  12
3       T    3       Dom  Left  25
4       T    4 Francesca  Left  42
5       H    5      Mary Right  18
6       H    6     Allen Right  12
7       H    7       Dom Right  25
8       T    8 Francesca Right  42
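
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame printed above. The column types are an assumption, since the printout does not show them: outcome, person and hand are treated as factors, toss as integer and age as numeric.

```r
# Rebuild the example data frame from the printout above.
# Assumption: character columns are factors (stringsAsFactors = TRUE).
df <- data.frame(
  outcome = c("H", "T", "T", "T", "H", "H", "H", "T"),
  toss    = 1:8,
  person  = rep(c("Mary", "Allen", "Dom", "Francesca"), times = 2),
  hand    = rep(c("Left", "Right"), each = 4),
  age     = rep(c(18, 12, 25, 42), times = 2),
  stringsAsFactors = TRUE
)

print(df)
```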

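The df data frame has a binomial response, outcome, which is either heads or tails, and I am going to look at how person, hand and age might affect this categorical outcome. I plan to use a forward selection approach, which tests one variable against the response and then progresses towards adding more.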

To keep things simple, I want to be able to identify the response/dependent (e.g., outcome) and predictor/independent (e.g., person, hand) variables before calling my user-defined function, like so:

> independent<-c('person','hand','age')
> dependent<-'outcome'

Then I create my function using the lapply and glm functions:

> test.func<-function(some_data,the_response,the_predictors)
+ {
+     lapply(the_predictors,function(a)
+         {
+         glm(substitute(as.name(the_response)~i,list(i=as.name(a))),data=some_data,family=binomial)
+     })
+ }

Yet when I attempt to run the function with the predetermined vectors, this happens:

> test.func(df,dependent,independent)
Error in as.name(the_response) : object 'the_response' not found

My expected result would be the following:

> models<-lapply(independent,function(x)
+ {
+     glm(substitute(outcome~i,list(i=as.name(x))),data=df,family=binomial)
+ })
> models
[[1]]

Call:  glm(formula = substitute(outcome ~ i, list(i = as.name(x))), family = binomial, data = df)

Coefficients:
    (Intercept)        personDom  personFrancesca       personMary  
      1.489e-16       -1.799e-16        1.957e+01       -1.957e+01  

Degrees of Freedom: 7 Total (i.e. Null);  4 Residual
Null Deviance:      11.09 
Residual Deviance: 5.545    AIC: 13.55

[[2]]

Call:  glm(formula = substitute(outcome ~ i, list(i = as.name(x))), family = binomial, data = df)

**End Snippet**

As you can tell, using lapply and glm, I created 3 simple models without the extra work of fitting each one individually. You may be asking why I want a user-defined function when the simple code is right there. I plan to run this inside a while loop, and a function will reduce the clutter.

Thanks for your help.

1 answer:

Answer 0 (score: 2)

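I know code-only answers are frowned upon, but I think you were almost there and just needed a nudge: use the formula function, and include the_response in the substitution list: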

 test.func<-function(some_data,the_response,the_predictors)
 {
     lapply(the_predictors,function(a)
         {print(   form<- formula(substitute(resp~i,
                                             list(resp=as.name(the_response), i=as.name(a)))))
         glm(form, data=some_data,family=binomial)
     })
 }
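
To see why moving the response into the substitution list matters, the sketch below compares the two substitute() calls on their own. The values assigned to the_response and a are hypothetical stand-ins for the function arguments. substitute() only replaces symbols that appear in the list it is given, so in the first form as.name(the_response) is left in the expression as an unevaluated call.

```r
# Hypothetical stand-ins for the function arguments.
the_response <- "outcome"
a <- "person"

# the_response is not in the substitution list, so the call
# as.name(the_response) stays in the expression untouched.
bad <- substitute(as.name(the_response) ~ i, list(i = as.name(a)))
print(bad)

# Passing both symbols through the list gives the intended expression,
# which formula() then turns into a formula object.
good <- substitute(resp ~ i,
                   list(resp = as.name(the_response), i = as.name(a)))
form <- formula(good)
print(form)
```

In the first form glm is handed a left-hand side that still contains as.name(the_response), which is presumably where the "object 'the_response' not found" error in the question comes from.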

Testing:

> test.func(df,dependent,independent)
outcome ~ person
<environment: 0x7f91a1ba5588>
outcome ~ hand
<environment: 0x7f91a2b38098>
outcome ~ age
<environment: 0x7f91a3fad468>
[[1]]

Call:  glm(formula = form, family = binomial, data = some_data)

Coefficients:
    (Intercept)        personDom  personFrancesca       personMary  
      8.996e-17       -1.540e-16        1.957e+01       -1.957e+01  

Degrees of Freedom: 7 Total (i.e. Null);  4 Residual
Null Deviance:      11.09 
Residual Deviance: 5.545    AIC: 13.55

[[2]]

Call:  glm(formula = form, family = binomial, data = some_data)

#snipped
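
As a possible alternative (not what the code above uses), base R's reformulate() builds a formula directly from character strings, which avoids substitute() altogether. This is a minimal sketch: test.func2 is a hypothetical name, and the data frame is rebuilt from the question's printout with character columns assumed to be factors.

```r
# Example data rebuilt from the question's printout
# (assumption: character columns are factors).
df <- data.frame(
  outcome = c("H", "T", "T", "T", "H", "H", "H", "T"),
  toss    = 1:8,
  person  = rep(c("Mary", "Allen", "Dom", "Francesca"), times = 2),
  hand    = rep(c("Left", "Right"), each = 4),
  age     = rep(c(18, 12, 25, 42), times = 2),
  stringsAsFactors = TRUE
)

independent <- c("person", "hand", "age")
dependent   <- "outcome"

# Hypothetical variant of test.func: one single-predictor glm per predictor.
test.func2 <- function(some_data, the_response, the_predictors) {
  lapply(the_predictors, function(a) {
    # reformulate("person", response = "outcome") gives outcome ~ person
    form <- reformulate(a, response = the_response)
    glm(form, data = some_data, family = binomial)
  })
}

models <- test.func2(df, dependent, independent)
```

With this data the person model may emit a warning about fitted probabilities of 0 or 1, since Mary and Francesca each show only one outcome. That matches the very large person coefficients in the output above.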