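I want to fit a multivariate polynomial of arbitrary degree, in an arbitrary number of variables, to some data. The number of variables can be high (e.g. 40), and the code should work for different numbers of variables (e.g. 10, 20, 40, etc.), so writing the formula out explicitly is not an option. For a degree-1 polynomial (i.e. the classical linear model) the solution is trivial: assuming my data are in a data frame df, then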
mymodel <- lm(y ~ ., data = df)
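Unfortunately, I don't know of a similarly compact formula when the polynomial has arbitrary degree. Can you help me?
Answer 0: (score: 2)
This combines the two options I posted earlier (interaction terms and polynomial terms), assuming the column names look like "X1", "X2", ..., "X30". You can take out the terms() call, which is only there to demonstrate that the formula is built successfully: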
terms( as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+",
paste( "poly(", paste0("X", 1:30), ", degree=2)",
collapse="+"),
collapse="")
) )
You could use an expression such as names(dfrm)[!names(dfrm) %in% "y"] in place of the inner paste0 calls.
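As a minimal sketch of that substitution, the formula can be built from the column names instead of a hard-coded "X1" ... "X30". The data frame dfrm, its columns a, b, c, and the response name y are made up for illustration; only the names(dfrm)[!names(dfrm) %in% "y"] expression comes from the answer.

```r
# Hypothetical data frame; dfrm, y, a, b, c are illustrative names only.
dfrm <- data.frame(y = c(1, 2, 3, 4, 5),
                   a = c(2, 4, 1, 3, 5),
                   b = c(5, 3, 4, 1, 2),
                   c = c(1, 5, 2, 4, 3))

# Every column except the response becomes a predictor.
vars <- names(dfrm)[!names(dfrm) %in% "y"]

# Right-hand side: two-way interactions plus a degree-2 poly() per variable.
rhs <- paste0("(", paste(vars, collapse = " + "), ")^2 + ",
              paste0("poly(", vars, ", degree = 2)", collapse = " + "))
form <- as.formula(paste("y ~", rhs))

# Inspect the expanded terms without fitting anything.
labs <- attr(terms(form), "term.labels")
print(labs)
```

Note that each variable then appears both as a plain linear term and inside poly(), so a fit with lm() can be expected to report NA for the aliased coefficients.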
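Note that the interaction terms are built by R's formula handling through the (...)^2 mechanism, which does not create squared terms but rather all two-way interactions: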
as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", paste( "poly(", paste0("X", 1:30), ", degree=2)", collapse="+"), collapse="")
)
#----output----
~(X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 +
X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 +
X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30)^2 + poly(X1,
degree = 2) + poly(X2, degree = 2) + poly(X3, degree = 2) +
poly(X4, degree = 2) + poly(X5, degree = 2) + poly(X6, degree = 2) +
poly(X7, degree = 2) + poly(X8, degree = 2) + poly(X9, degree = 2) +
poly(X10, degree = 2) + poly(X11, degree = 2) + poly(X12,
degree = 2) + poly(X13, degree = 2) + poly(X14, degree = 2) +
poly(X15, degree = 2) + poly(X16, degree = 2) + poly(X17,
degree = 2) + poly(X18, degree = 2) + poly(X19, degree = 2) +
poly(X20, degree = 2) + poly(X21, degree = 2) + poly(X22,
degree = 2) + poly(X23, degree = 2) + poly(X24, degree = 2) +
poly(X25, degree = 2) + poly(X26, degree = 2) + poly(X27,
degree = 2) + poly(X28, degree = 2) + poly(X29, degree = 2) +
poly(X30, degree = 2)
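To see on a small scale that (...)^2 expands to main effects and two-way interactions rather than squares, the term labels of a three-variable formula can be inspected. This is only an illustrative sketch; the three-variable formula is chosen for brevity and is not part of the answer.

```r
# Expand a small formula and list its terms; no data are needed for this.
labs <- attr(terms(~ (X1 + X2 + X3)^2), "term.labels")
print(labs)
# "X1"    "X2"    "X3"    "X1:X2" "X1:X3" "X2:X3"
```

No X1^2-style term appears, which is why the poly() terms are added separately to supply the squared components.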
Answer 1: (score: 0)
You can use this function, makepoly, to generate a formula with polynomial terms from a formula and a data frame.
makepoly <- function(form, data, degree = 1) {
  # Expand any "." in the formula against the columns of `data`
  mt <- terms(form, data = data)
  tl <- attr(mt, "term.labels")
  # Wrap each predictor in poly() and keep the original response
  reformulate(paste0("poly(", tl, ", ", degree, ")"),
              response = form[[2]])
}
Test data set:
set.seed(1)
df <- data.frame(y = rnorm(10),
x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
Create the formula and run the regression:
form <- makepoly(y ~ ., df, degree = 2)
# y ~ poly(x1, 2) + poly(x2, 2) + poly(x3, 2)
lm(form, df)
#
# Call:
# lm(formula = form, data = df)
#
# Coefficients:
# (Intercept) poly(x1, 2)1 poly(x1, 2)2 poly(x2, 2)1
# 0.1322 0.1445 -5.5757 -5.2132
# poly(x2, 2)2 poly(x3, 2)1 poly(x3, 2)2
# 4.2297 0.7895 3.9796
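The formula above expands each variable on its own, so it contains no cross terms such as x1*x2. If a full polynomial in all variables is wanted, one possible sketch (not part of either answer) passes a matrix to poly(), which then returns every term up to the given total degree. The data frame df2 and the sample size are made up for illustration.

```r
# Hypothetical data with more rows than polynomial terms.
set.seed(1)
n <- 50
df2 <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# All predictors, whatever their number.
vars <- setdiff(names(df2), "y")

# With a matrix argument, poly() builds all terms of total degree <= 2,
# including cross terms between different variables.
X <- poly(as.matrix(df2[vars]), degree = 2)

# Fit the model on the expanded design matrix.
fit <- lm(df2$y ~ X)
print(ncol(X))
```

The number of columns is choose(p + d, d) - 1 for p variables and degree d, so 3 variables at degree 2 give 9 terms while 40 variables at degree 2 already give 860. The data need correspondingly more rows than that for the fit to be determined.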