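Fitting a multivariate polynomial of arbitrary degree in R without writing out the formula explicitly

Asked: 2015-01-23 16:25:17

Tags: r regression

I would like to fit a multivariate polynomial of arbitrary degree, in an arbitrary number of variables, to some data. The number of variables can be large (e.g. 40), and the code should work for different numbers of variables (e.g. 10, 20, 40), so writing out the formula explicitly is not an option. For a polynomial of degree 1 (i.e. the classic linear model) the solution is trivial: assuming my data are in the data frame df, then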

mymodel <- lm(y ~ ., data = df)

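Unfortunately, I don't know of a similarly compact formula when the polynomial has arbitrary degree. Can you help me?

2 answers:

Answer 0: (score: 2)

This combines the two options I posted earlier (interactions and polynomial terms), for a hypothetical case where the column names look like "X1", "X2", ..., "X30". The terms() call is only there to show that the formula is built successfully; you can take it out: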

terms( as.formula( 
     paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", 
            paste( "poly(", paste0("X", 1:30), ", degree=2)", 
                    collapse="+"), 
          collapse="")
      )         )

You could substitute an expression like names(dfrm)[!names(dfrm) %in% "y"] for the inner paste0 calls.
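As a minimal sketch of that substitution: the data frame dfrm, its column names (u, v, w) and the response name y are all hypothetical here; only the column names matter for building the formula.

```r
# Hypothetical data frame standing in for `dfrm`; values are arbitrary
dfrm <- data.frame(y = 1:5, u = 5:1, v = c(2, 4, 1, 3, 5), w = c(1, 0, 1, 0, 1))

# All column names except the response
preds <- names(dfrm)[!names(dfrm) %in% "y"]

# Same construction as above, with the hard-coded X1..X30 replaced by `preds`
rhs <- paste0("(", paste(preds, collapse = " + "), ")^2 + ",
              paste0("poly(", preds, ", degree = 2)", collapse = " + "))
form <- as.formula(paste("y ~", rhs))
print(form)
```

Because the predictor names are read from the data frame, the same lines should work unchanged for 10, 20 or 40 columns.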

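Note that the interaction terms are built by R's formula handling through the (...)^2 mechanism, which does not create squared terms but rather all two-way interactions: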

as.formula( 
        paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", paste( "poly(", paste0("X", 1:30), ", degree=2)", collapse="+"), collapse="")
        ) 
#----output----
~(X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 + 
    X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 + 
    X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30)^2 + poly(X1, 
    degree = 2) + poly(X2, degree = 2) + 
    poly(X3, degree = 2) + 
    poly(X4, degree = 2) + poly(X5, degree = 2) + poly(X6, degree = 2) + 
    poly(X7, degree = 2) + poly(X8, degree = 2) + poly(X9, degree = 2) + 
    poly(X10, degree = 2) + poly(X11, degree = 2) + poly(X12, 
     degree = 2) + poly(X13, degree = 2) + poly(X14, degree = 2) + 
    poly(X15, degree = 2) + poly(X16, degree = 2) + poly(X17, 
     degree = 2) + poly(X18, degree = 2) + poly(X19, degree = 2) + 
    poly(X20, degree = 2) + poly(X21, degree = 2) + poly(X22, 
     degree = 2) + poly(X23, degree = 2) + poly(X24, degree = 2) + 
    poly(X25, degree = 2) + poly(X26, degree = 2) + poly(X27, 
     degree = 2) + poly(X28, degree = 2) + poly(X29, degree = 2) + 
    poly(X30, degree = 2)
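To see on a small scale what (...)^2 expands to, the following sketch lists the term labels for three hypothetical variables X1, X2, X3. No data are needed, since terms() only parses the formula.

```r
# (X1 + X2 + X3)^2 expands to main effects plus pairwise interactions only;
# no squared terms such as X1^2 appear
labs <- attr(terms(~ (X1 + X2 + X3)^2), "term.labels")
print(labs)
# "X1"    "X2"    "X3"    "X1:X2" "X1:X3" "X2:X3"

# One possible way to request an explicit square is I(); shown for illustration
labs_sq <- attr(terms(~ (X1 + X2)^2 + I(X1^2) + I(X2^2)), "term.labels")
print(labs_sq)
```

One thing to keep in mind when fitting: poly(X1, degree = 2) already spans the linear term in X1, so it overlaps with the main effect X1 coming from the (...)^2 part, and lm() will likely report NA for the redundant coefficient. Using I(X1^2) for the square, as in the second call, is one way to avoid that overlap.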

Answer 1: (score: 0)

You can use the function makepoly below to generate a formula with polynomial terms from a formula and a data frame.

makepoly <- function(form, data, degree = 1) {
  # Expand the formula (including `.`) against the columns of `data`
  mt <- terms(form, data = data)
  tl <- attr(mt, "term.labels")
  # Wrap each predictor in poly() and reattach the original response
  reformulate(paste0("poly(", tl, ", ", degree, ")"), 
              response = form[[2]])
}
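The two steps inside the function can be looked at separately. This is a small sketch on a hypothetical data frame toy: terms() resolves the `.` into the predictor names, and reformulate() then assembles a new formula from strings.

```r
# Hypothetical toy data; the column names are for illustration only
set.seed(42)
toy <- data.frame(y = rnorm(6), x1 = rnorm(6), x2 = rnorm(6))

# Step 1: `.` is expanded to every column except the response
tl <- attr(terms(y ~ ., data = toy), "term.labels")
print(tl)
# "x1" "x2"

# Step 2: build the polynomial formula from the labels
f <- reformulate(paste0("poly(", tl, ", 2)"), response = "y")
print(f)
# y ~ poly(x1, 2) + poly(x2, 2)
```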

Test data set:

set.seed(1)
df <- data.frame(y = rnorm(10), 
                 x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

Create the formula and run the regression:

form <- makepoly(y ~ ., df, degree = 2)
# y ~ poly(x1, 2) + poly(x2, 2) + poly(x3, 2)


lm(form, df)
# 
# Call:
#   lm(formula = form, data = df)
# 
# Coefficients:
#  (Intercept)  poly(x1, 2)1  poly(x1, 2)2  poly(x2, 2)1  
#       0.1322        0.1445       -5.5757       -5.2132  
# poly(x2, 2)2  poly(x3, 2)1  poly(x3, 2)2  
#       4.2297        0.7895        3.9796
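Note that the formula above contains a separate polynomial per variable and no cross terms such as x1*x2. If the full multivariate polynomial is wanted, one option that may be worth trying is to pass a matrix of predictors to poly(), which then hands over to polym(). The sketch below is not part of the answer above; the data frame dat, the sample size and the use of raw = TRUE are assumptions made for illustration.

```r
set.seed(1)
n <- 50   # assumed sample size, larger than the 10 coefficients to estimate
dat <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Predictor matrix built from whatever columns are present, response excluded
preds <- setdiff(names(dat), "y")
X <- as.matrix(dat[, preds])

# poly() on a matrix builds every monomial up to total degree 2, cross terms
# included; raw = TRUE uses plain powers instead of orthogonal polynomials
P <- poly(X, degree = 2, raw = TRUE)
print(colnames(P))        # exponent patterns, one per monomial

fit <- lm(dat$y ~ P)
print(length(coef(fit)))  # intercept plus one coefficient per monomial
```

The number of monomials grows quickly with the number of variables: for 40 variables and degree 2 there are choose(42, 2) - 1 = 860 of them, so considerably more rows than that would be needed for a stable fit.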