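I'm trying to demonstrate the performance of different models and feature selection techniques through simulation, so I'd like to pass various parameters to glm() programmatically.
Under ?glm we read (italics mine):
family: a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for details of family functions.)
The problem is that when I call step() on the resulting model, there seems to be a scoping issue and the family= argument is no longer recognized.
Here is a minimal example: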
getCoef <- function(formula,
                    family = c("gaussian", "binomial"),
                    data){
  model_fam <- match.arg(family, c("gaussian", "binomial"))
  fit_null <- glm(update(formula, ".~1"),
                  family = model_fam,
                  data = data)
  message("So far so good")
  fit_stepBIC <- step(fit_null,
                      formula,
                      direction = "forward",
                      k = log(nrow(data)),
                      trace = 0)
  message("Doesn't make it this far")
  fit_stepBIC$coefficients
}
# returns error 'model_fam' not found
getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
Error message with traceback:
> getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
So far so good
Error in stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam, :
object 'model_fam' not found
9 stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam,
data = data, method = "model.frame")
8 eval(expr, envir, enclos)
7 eval(fcall, env)
6 model.frame.glm(fob, xlev = object$xlevels)
5 model.frame(fob, xlev = object$xlevels)
4 add1.glm(fit, scope$add, scale = scale, trace = trace, k = k,
...)
3 add1(fit, scope$add, scale = scale, trace = trace, k = k, ...)
2 step(fit_null, formula, direction = "forward", k = log(nrow(data)),
trace = 0)
1 getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian",
data = iris)
> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] rsconnect_0.4.1.11 tools_3.2.4
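What is the most natural way to pass this argument so that step() recognizes it? One possible workaround I know of is to call glm() with an explicit family name, chosen through if-then-else conditions on model_fam.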
Answer 0 (score: 2)
I think the following solution, based on bquote, .() and eval, might solve your problem.
I also have R version 3.2.4 installed, and I get exactly the same error from your code. The solution below makes it run on my computer.
getCoef <- function(formula,
                    family = c("gaussian", "binomial"),
                    data){
  model_fam <- match.arg(family, c("gaussian", "binomial"))
  fit_null <- eval(bquote(
    glm(update(.(formula), ".~1"),
        family = .(model_fam),
        data = .(data))))
  message("So far so good")
  fit_stepBIC <- step(fit_null,
                      formula,
                      direction = "forward",
                      k = log(nrow(data)),
                      trace = 0)
  message("Doesn't make it this far")
  fit_stepBIC$coefficients
}
# no longer fails with 'model_fam' not found
getCoef(formula = Petal.Length ~ Petal.Width + Species,
        family = "gaussian",
        data = iris)
So far so good
Doesn't make it this far
      (Intercept) Speciesversicolor  Speciesvirginica       Petal.Width
         1.211397          1.697791          2.276693          1.018712
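To see why substituting values with bquote helps, it can be useful to compare the call that glm() stores in the fitted object with and without substitution. The snippet below is a minimal sketch, not part of the original answer; the intercept-only formula and the variable names fit_plain and fit_bquote are chosen purely for illustration.

```r
model_fam <- "gaussian"

# Ordinary call: the stored call keeps the *symbol* model_fam, which must be
# looked up again whenever step()/add1()/update() re-evaluate the call.
fit_plain <- glm(Sepal.Length ~ 1, family = model_fam, data = iris)
print(fit_plain$call$family)

# bquote() replaces .(model_fam) with its current value before evaluation,
# so the stored call carries the literal string and needs no later lookup.
fit_bquote <- eval(bquote(glm(Sepal.Length ~ 1,
                              family = .(model_fam),
                              data = iris)))
print(fit_bquote$call$family)
```

One trade-off of this approach: substituting .(data) in the same way embeds the whole data frame in the stored call, so printing the call of the fitted model becomes very verbose.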
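Answer 1 (score: 1)
The problem is that step eventually calls model.frame, and model.frame evaluates the terms object in a special environment, namely the environment in which the formula was defined. That is usually the environment from which getCoef was called. But model_fam does not exist in that environment, because it is defined inside getCoef. One way around this is to add
environment(formula) <- environment()
after
model_fam <- match.arg(family, c("gaussian", "binomial"))
or something to that effect.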
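Put together, the environment-based fix might look like the sketch below. It is the question's getCoef() with the one extra line added; using coef() and an unquoted . ~ 1 in update() are small stylistic assumptions rather than part of the original answer.

```r
# Sketch: the question's getCoef() with the formula's environment rebound.
getCoef <- function(formula,
                    family = c("gaussian", "binomial"),
                    data) {
  model_fam <- match.arg(family)
  # Point the formula at this function's frame so that step(), add1() and
  # model.frame() can find model_fam and data when they re-evaluate the call.
  environment(formula) <- environment()
  fit_null <- glm(update(formula, . ~ 1),
                  family = model_fam,
                  data = data)
  fit_stepBIC <- step(fit_null,
                      formula,
                      direction = "forward",
                      k = log(nrow(data)),
                      trace = 0)
  coef(fit_stepBIC)
}

coefs <- getCoef(Petal.Length ~ Petal.Width + Species,
                 family = "gaussian", data = iris)
print(coefs)
```

One caveat: after the rebinding, names in the formula are resolved from inside getCoef (and then from wherever getCoef was defined), so a formula that refers to variables local to some other calling function would no longer find them.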