R code: rolling regression after stepwise selection

Date: 2013-10-28 06:00:49

Tags: r

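I have spent the whole day trying to solve this problem, so please help me. The example I give here is very simple, but my real data has a lot of variables, about 2,000. To run a regression I therefore need to select a subset of the variables. I also need to build many models, so the process has to be automated.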

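  1. I run a stepwise regression.
  2. I don't know in advance how many variables stepwise will select.
  3. After the variables are selected, I run a rolling regression for prediction.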

    library(zoo)  # provides rollapply()

    # run regression on all predictors
    dep <- "mpg"
    m <- lm(mpg ~ ., data = mtcars)

    # run stepwise selection
    s <- step(m, direction = "both")

    # build a formula string from the selected variables
    variable <- attr(s$terms, "term.labels")
    b <- paste(dep, paste(variable, collapse = "+"), sep = "~")

    rollapply(mtcars, width = 2,
              FUN = function(z) coef(lm(b, data = as.data.frame(z))),
              by.column = FALSE, align = "right")
    

    # This is the automated version I developed.

    models2 <- lapply(1:11, function(x) {
      dep <- names(mtcars)[x]
      ind <- mtcars[-x]
      w <- names(ind)
      indep <- paste(dep, paste(w, collapse = "+"), sep = "~")
      m <- lm(indep, data = mtcars)
      s <- step(m, direction = "both")
      # use the selected term labels, not the fitted model object itself
      variable <- attr(s$terms, "term.labels")
      b <- paste(dep, paste(variable, collapse = "+"), sep = "~")
      rollapply(mtcars, width = 2,
                FUN = function(z) coef(lm(b, data = as.data.frame(z))),
                by.column = FALSE, align = "right")})
    
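  4. I want to compute predictions from the rolling regression.

    However, this is very hard to set up, because the data.frame has to be built without knowing the independent variables in advance.

    There is a similar question here, but in that model the independent variables are already known.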

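1 answer:

Answer 0 (score: 0)

You don't need to know the independent variables! If you supply a data.frame containing all the variables, the predict function will pick out the ones it needs. Similar to the post you linked, you could do this: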

    library(zoo)  # provides rollapply()
    mtcars[, "int"] <- seq(nrow(mtcars)) # row index, used to pick the newdata row
    models2 <- lapply(1:11, function(x) {
      dep <- names(mtcars)[x]
      w <- setdiff(names(mtcars), c(dep, "int")) # candidate predictors, excluding the index column
      form <- paste(dep, paste(w, collapse = "+"), sep = "~")
      m <- lm(form, data = mtcars)
      s <- step(m, direction = "both", trace = 0) # model selection (don't print trace)
      b <- formula(s) # this is clearer than your version
      rpl <- rollapply(mtcars, width = 20, # with width = 2 there are fewer rows than coefficients, so the fit is underdetermined
              FUN = function(z) {
                nextD <- max(z[, "int"]) + 1 # index of the row used as new data
                fit <- lm(b, data = as.data.frame(z)) # fit the model on this window
                c(coef = coef(fit), # coefficients
                  predicted = predict(fit, newdata = mtcars[nextD, ])) # predict the next row (NA for the last window, which has no next row)
              },
              by.column = FALSE, align = "right")
      rpl
    })
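
To see why no prior knowledge of the selected variables is needed, here is a minimal sketch. The model `mpg ~ wt + hp` is only an illustrative stand-in for whatever `step()` ends up keeping, and the names `selected`, `n_selected`, `p_all` and `p_only` are made up for this example. It reads the number of selected terms off the fitted model and checks that `predict()` gives the same result whether `newdata` holds every column or only the ones in the formula:

```r
# Illustrative model; in practice this would be the object returned by step().
fit <- lm(mpg ~ wt + hp, data = mtcars)

# The terms kept in the model, and how many there are.
selected <- attr(terms(fit), "term.labels")
n_selected <- length(selected)

# newdata with every column of mtcars versus only the columns the model uses.
p_all  <- predict(fit, newdata = mtcars[1:3, ])
p_only <- predict(fit, newdata = mtcars[1:3, c("wt", "hp")])

# The extra columns are ignored, so both predictions agree.
all.equal(p_all, p_only)
```

The same idea is what lets the rolling function above pass a whole row of `mtcars` as `newdata` for every model in the loop.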