I am trying to run ANOVA on different sets of data and am not quite sure how to do it. I searched around and found this useful: https://stats.idre.ucla.edu/r/codefragments/looping_strings/
hsb2 <- read.csv("https://stats.idre.ucla.edu/stat/data/hsb2.csv")
names(hsb2)
varlist <- names(hsb2)[8:11]
models <- lapply(varlist, function(x) {
lm(substitute(read ~ i, list(i = as.name(x))), data = hsb2)
})
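My understanding of the code above is that it defines a function that calls lm() and applies it to each variable in varlist, fitting a linear regression on each one.
So I thought that using aov instead of lm would work for me, like this: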
aov(substitute(read ~ i, list(i = as.name(x))), data = hsb2)
However, I got this error:
Error in terms.default(formula, "Error", data = data) :
no terms component nor attribute
I don't know where the error comes from. Please help!
Answer 0 (score: 5)
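This should do it. The varlist vector is passed to the function item by item, and each item selects one column. The lm function sees only a two-column data frame, and the "read" column is the dependent variable each time. No need for fancy substitution: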
models <- sapply(varlist, function(x) {
lm(read ~ ., data = hsb2[, c("read", x) ])
}, simplify=FALSE)
> summary(models[[1]]) # The first model. Note the use of "[["
Call:
lm(formula = read ~ ., data = hsb2[, c("read", x)])
Residuals:
     Min       1Q   Median       3Q      Max
-19.8565  -5.8976  -0.8565   5.5801  24.2703
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.16215    3.30716   5.492 1.21e-07 ***
write        0.64553    0.06168  10.465  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.248 on 198 degrees of freedom
Multiple R-squared: 0.3561, Adjusted R-squared: 0.3529
F-statistic: 109.5 on 1 and 198 DF, p-value: < 2.2e-16
For all models:
lapply(models, summary)
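Since the question asks about aov rather than lm, here is a minimal sketch of the same two-column approach with aov() swapped in. It assumes the built-in mtcars data as a stand-in so that it runs without a download; the response "mpg" and the predictors in varlist are illustrative choices, not part of the original thread.

```r
# Stand-in data: built-in mtcars, so the sketch runs offline.
# With the thread's data, use hsb2, "read", and names(hsb2)[8:11] instead.
varlist <- c("wt", "hp", "disp")

# Same two-column trick as above, with aov() in place of lm().
aov_models <- sapply(varlist, function(x) {
  aov(mpg ~ ., data = mtcars[, c("mpg", x)])
}, simplify = FALSE)

# sapply() over a character vector names the result after its elements.
names(aov_models)

# One ANOVA table by name.
summary(aov_models[["wt"]])

# The p-value of the single predictor in each model.
p_values <- sapply(aov_models, function(m) summary(m)[[1]][["Pr(>F)"]][1])
p_values
```

Because each formula is an ordinary `mpg ~ .`, aov() receives a real formula and the terms error from the question should not come up.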
Answer 1 (score: 5)
The problem is that substitute() returns an expression, not a formula. I think @thelatemail's suggestion of
lm(as.formula(paste("read ~",x)), data = hsb2)
is a good way to go about it. Alternatively, you can evaluate the expression to get the formula:
models <- lapply(varlist, function(x) {
aov(eval(substitute(read ~ i, list(i = as.name(x)))), data = hsb2)
})
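The difference between the unevaluated expression and the formula can be checked directly. The following is a small sketch that assumes the built-in mtcars data and a hypothetical predictor name "wt" in place of hsb2 and an element of varlist.

```r
x <- "wt"  # hypothetical predictor name, standing in for one element of varlist

# substitute() builds an unevaluated call to `~`; it is not yet a formula.
expr <- substitute(mpg ~ i, list(i = as.name(x)))
class(expr)  # "call"

# Evaluating that call runs `~`, which creates the formula object.
f <- eval(expr)
class(f)     # "formula"

# aov() accepts the evaluated formula.
fit <- aov(f, data = mtcars)
summary(fit)
```

The error message in the question mentions terms.default(formula, "Error", data = data), which suggests that aov() passed the bare call on to terms() and no formula method was dispatched.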
I guess it depends on what you want to do with the list of models afterwards. Doing
models <- lapply(varlist, function(x) {
eval(bquote(aov(read ~ .(as.name(x)), data = hsb2)))
})
gives a "cleaner" call property for each of the results.
Answer 2 (score: 4)
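To compare what each variant stores in its call, here is a short sketch. It assumes the built-in mtcars data with hypothetical predictors "wt" and "hp" instead of hsb2, so that it runs without a download.

```r
varlist <- c("wt", "hp")  # hypothetical predictors from mtcars

# Variant 1: evaluate the substituted expression inside the aov() call.
m_eval <- lapply(varlist, function(x) {
  aov(eval(substitute(mpg ~ i, list(i = as.name(x)))), data = mtcars)
})

# Variant 2: build the whole aov() call with bquote(), then evaluate it.
m_bquote <- lapply(varlist, function(x) {
  eval(bquote(aov(mpg ~ .(as.name(x)), data = mtcars)))
})

# The stored call of variant 1 still contains the eval(substitute(...)) code.
m_eval[[1]]$call

# The stored call of variant 2 contains the finished formula, mpg ~ wt.
m_bquote[[1]]$call

# Both variants fit the same model.
all.equal(coef(m_eval[[1]]), coef(m_bquote[[1]]))
```

The fitted models are the same; only the recorded call differs, which may matter if you print the models or reuse the call later.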
do.call puts the variables into the call output so that it reads properly. Here is a general function for simple regression.
doModel <- function(col1, col2, data = hsb2, FUNC = "lm")
{
form <- as.formula(paste(col1, "~", col2))
do.call(FUNC, list(form, substitute(data)))
}
lapply(varlist, doModel, col1 = "read")
# [[1]]
#
# Call:
# lm(formula = read ~ write, data = hsb2)
#
# Coefficients:
# (Intercept) write
# 18.1622 0.6455
#
#
# [[2]]
#
# Call:
# lm(formula = read ~ math, data = hsb2)
#
# Coefficients:
# (Intercept) math
# 14.0725 0.7248
#
# ...
# ...
# ...
Note: as thelatemail mentions in the comments,
sapply(varlist, doModel, col1 = "read", simplify = FALSE)
will keep the names in the list and also allow list$name subsetting.
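As a sketch of how this carries over to the ANOVA case from the question, the same helper can be called with FUNC = "aov". The function is repeated here so the block runs on its own, and it assumes the built-in mtcars data with "mpg", "wt", and "hp" as stand-ins for hsb2 and its columns.

```r
# Same helper as above, with mtcars as the default data (an assumption for this sketch).
doModel <- function(col1, col2, data = mtcars, FUNC = "lm") {
  form <- as.formula(paste(col1, "~", col2))
  # substitute(data) passes the data argument as written, so the call shows its name.
  do.call(FUNC, list(form, substitute(data)))
}

# simplify = FALSE keeps a list; sapply() names it after the predictors.
fits <- sapply(c("wt", "hp"), doModel, col1 = "mpg", FUNC = "aov",
               simplify = FALSE)

names(fits)
summary(fits$wt)  # list$name subsetting works because the list is named
```

Passing the function name as a string keeps one helper for both lm and aov, at the cost of less checking of what FUNC actually is.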