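I am trying to build a user-defined function that takes predetermined variables (independent and dependent) from the active data frame as input. Take the following example data frame, df, which looks at the outcome of a coin toss in relation to other recorded variables: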
> df
  outcome toss    person  hand age
1       H    1      Mary  Left  18
2       T    2     Allen  Left  12
3       T    3       Dom  Left  25
4       T    4 Francesca  Left  42
5       H    5      Mary Right  18
6       H    6     Allen Right  12
7       H    7       Dom Right  25
8       T    8 Francesca Right  42
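The df data frame has a binomial response, outcome, which is either heads or tails, and I am going to look at how person, hand, and age might affect this categorical outcome. I plan to use a forward selection approach, which tests one variable against the response at a time and then progressively adds more.
To keep things simple, I want to be able to identify the response/dependent (e.g., outcome) and predictor/independent (e.g., person, hand) variables before calling my user-defined function, like so:
> independent<-c('person','hand','age')
> dependent<-'outcome'
Then I create my function using the lapply and glm functions:
> test.func<-function(some_data,the_response,the_predictors)
+ {
+   lapply(the_predictors,function(a)
+   {
+     glm(substitute(as.name(the_response)~i,list(i=as.name(a))),data=some_data,family=binomial)
+   })
+ }
Yet when I attempt to run the function with the predetermined vectors, this occurs:
> test.func(df,dependent,independent)
Error in as.name(the_response) : object 'the_response' not found
My expected response would be the following: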
> models<-lapply(independent,function(x)
+ {
+   glm(substitute(outcome~i,list(i=as.name(x))),data=df,family=binomial)
+ })
> models
[[1]]
Call:  glm(formula = substitute(outcome ~ i, list(i = as.name(x))),
    family = binomial, data = df)
Coefficients:
    (Intercept)        personDom  personFrancesca       personMary
      1.489e-16       -1.799e-16        1.957e+01       -1.957e+01
Degrees of Freedom: 7 Total (i.e. Null);  4 Residual
Null Deviance:     11.09
Residual Deviance: 5.545    AIC: 13.55
[[2]]
Call:  glm(formula = substitute(outcome ~ i, list(i = as.name(x))),
    family = binomial, data = df)
**End Snippet**
As you can tell, using lapply and glm, I have created 3 simple models without the additional work of fitting each one individually. You may ask why I would create a user-defined function when the simple code is right there. I plan to run this inside a loop (for example, a while loop), and a function will reduce clutter.
Thank you for your assistance.
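Answer 0 (score: 2)
I know code-only answers are deprecated, but I thought you were almost there and could just use a nudge to use the formula function (and to include the_response in the substitution):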
test.func<-function(some_data,the_response,the_predictors)
{
  lapply(the_predictors,function(a)
  {
    # build response ~ predictor from the two character strings and print it
    print(form<-formula(substitute(resp~i,
                 list(resp=as.name(the_response), i=as.name(a)))))
    glm(form, data=some_data,family=binomial)
  })
}
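One way to see why the substitution in the question fails while the one here works is to print the two expressions. This is only a minimal sketch run at top level; the names bad and good are made up for illustration.

```r
the_response <- "outcome"
a <- "person"

# Question's version: substitute() only replaces symbols named in the list,
# so as.name(the_response) stays in the expression as an unevaluated call.
bad <- substitute(as.name(the_response) ~ i, list(i = as.name(a)))
print(bad)

# Answer's version: resp is in the list, so it is replaced by the symbol outcome.
good <- substitute(resp ~ i, list(resp = as.name(the_response), i = as.name(a)))
print(good)
```

The first expression prints as as.name(the_response) ~ person, the second as outcome ~ person. With the first, glm is left to evaluate as.name(the_response) itself while building the model frame, and it presumably does so somewhere the function argument the_response is not visible, hence the "object not found" error.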
Testing:
> test.func(df,dependent,independent)
outcome ~ person
<environment: 0x7f91a1ba5588>
outcome ~ hand
<environment: 0x7f91a2b38098>
outcome ~ age
<environment: 0x7f91a3fad468>
[[1]]
Call: glm(formula = form, family = binomial, data = some_data)
Coefficients:
    (Intercept)        personDom  personFrancesca       personMary
      8.996e-17       -1.540e-16        1.957e+01       -1.957e+01
Degrees of Freedom: 7 Total (i.e. Null); 4 Residual
Null Deviance: 11.09
Residual Deviance: 5.545 AIC: 13.55
[[2]]
Call: glm(formula = form, family = binomial, data = some_data)
#snipped
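As an alternative to substitute(), the formula can also be built with reformulate() from the stats package, which accepts the term labels and the response as character strings. The sketch below is not part of the answer above: fit_each is a hypothetical name, and df is rebuilt from the printout in the question on the assumption that outcome is stored as a factor.

```r
# Example data rebuilt from the question's printout (assumption: outcome is a factor).
df <- data.frame(
  outcome = factor(c("H", "T", "T", "T", "H", "H", "H", "T")),
  toss    = 1:8,
  person  = rep(c("Mary", "Allen", "Dom", "Francesca"), times = 2),
  hand    = rep(c("Left", "Right"), each = 4),
  age     = rep(c(18, 12, 25, 42), times = 2)
)

independent <- c("person", "hand", "age")
dependent   <- "outcome"

# fit_each is a hypothetical helper: one single-predictor binomial glm per name.
fit_each <- function(some_data, the_response, the_predictors) {
  models <- lapply(the_predictors, function(a) {
    form <- reformulate(a, response = the_response)  # e.g. outcome ~ person
    glm(form, data = some_data, family = binomial)
  })
  setNames(models, the_predictors)  # name each model after its predictor
}

# The person model may warn about fitted probabilities of 0 or 1 on this tiny data set.
models <- suppressWarnings(fit_each(df, dependent, independent))
sapply(models, AIC)  # one AIC per candidate predictor
```

Naming the list makes it easy to pick a model by predictor (models$hand), and the AIC vector is the kind of summary a forward-selection loop could compare at each step.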