Getting predicted values for the next period

Time: 2014-08-22 11:59:48

Tags: r time-series prediction

Please consider the following data:

y<- c(2,2,6,3,2,23,5,6,4,23,3,4,3,87,5,7,4,23,3,4,3,87,5,7)
x1<- c(3,4,6,3,3,23,5,6,4,23,6,5,5,1,5,7,2,23,6,5,5,1,5,7)
x2<- c(7,3,6,3,2,2,5,2,2,2,2,2,6,5,4,3,2,3,2,2,6,5,4,3)

type <- c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","c","c","c","c","c","c","c","c")
generation<- c(1,1,1,1,2,2,3,3,1,2,2,2,3,3,4,4,1,2,2,2,3,3,4,4)
year <- c(2004,2005,2006,2007,2008,2009,2010,2011,2004,2005,2006,2007,2008,2009,2010,2011,2004,2005,2006,2007,2008,2009,2010,2011)
data <- data.frame(y, x1, x2, type, generation, year) # the label vector defined above is called type

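The analysis I am doing looks at one year at a time and predicts the following year. In essence, it runs several separate analyses, each fitted on the data from a single time point only, and then predicts the next (only the immediately following) period.

I tried to set up an example for these three types: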

data2004 <- subset(data, year==2004)
data2005 <- subset(data, year==2005)
m1 <- lm(y~x1+x2, data=data2004)
preds <- predict(m1, data2005)

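How can I automate this? My preferred output would be the predicted values per type, giving a prediction for every observation present in the next period (the original data has 200 periods).

Thanks in advance, much appreciated!

1 Answer:

Answer 0: (score: 1)

The following may be closer to what you want.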

## assumption: dat holds the question's response, predictors, and year
dat <- data.frame(y, x1, x2, year)

uq.year <- sort(unique(dat$year)) ## sorted so that element i + 1 is the year after element i
year <- dat$year
dat$year <- NULL ## everything left in dat is either the response or a predictor

## labels each row with its type; assumes rows are grouped by type (all "a", then "b", then "c") as in the question
model <- rep(c("a", "b", "c"), each = length(year) / 3)

predlist <- vector("list", length(uq.year) - 1) ## there is one prediction fewer than the number of unique years

for (i in seq_len(length(uq.year) - 1))
{
  ## year is no longer a column of dat, so subset() picks up the year vector saved above
  mod <- lm(y ~ ., data = subset(dat, year == uq.year[i]))
  predlist[[i]] <- predict(mod, subset(dat, subset = year == uq.year[i + 1], select = -y))
  names(predlist[[i]]) <- model[year == uq.year[i + 1]] ## label each prediction with its type
}

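The reason we want dat to contain only the modeling variables (and not year) is so that we can use the y ~ . notation and avoid spelling out every predictor in the lm call.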
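If a single table is easier to work with than a list, the same year-by-year idea can be wrapped in a function. This is only a sketch: predict_next_year is a hypothetical helper name (not from any package), the data are the question's vectors, and the formula is written out as y ~ x1 + x2 so that type and year can stay in the data frame as labels.

```r
# Data from the question: rows are grouped by type, eight years each
y    <- c(2,2,6,3,2,23,5,6,4,23,3,4,3,87,5,7,4,23,3,4,3,87,5,7)
x1   <- c(3,4,6,3,3,23,5,6,4,23,6,5,5,1,5,7,2,23,6,5,5,1,5,7)
x2   <- c(7,3,6,3,2,2,5,2,2,2,2,2,6,5,4,3,2,3,2,2,6,5,4,3)
type <- rep(c("a", "b", "c"), each = 8)
year <- rep(2004:2011, times = 3)
dat  <- data.frame(y, x1, x2, type, year)

# Hypothetical helper: fit on each year, predict the year that follows it
predict_next_year <- function(d, formula = y ~ x1 + x2) {
  yrs <- sort(unique(d$year))
  out <- lapply(seq_len(length(yrs) - 1), function(i) {
    train <- d[d$year == yrs[i], ]       # data for the fitting year only
    test  <- d[d$year == yrs[i + 1], ]   # rows to predict in the next year
    fit   <- lm(formula, data = train)
    data.frame(year = yrs[i + 1],
               type = test$type,
               pred = unname(predict(fit, newdata = test)))
  })
  do.call(rbind, out)                    # one row per predicted observation
}

preds <- predict_next_year(dat)
head(preds)
```

With only three observations per year and three coefficients, several of these toy fits are saturated or rank-deficient, so R may warn about predictions from a rank-deficient fit. With more rows per period that should not normally happen.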