Question

Please consider the following data:

y<- c(2,2,6,3,2,23,5,6,4,23,3,4,3,87,5,7,4,23,3,4,3,87,5,7)
x1<- c(3,4,6,3,3,23,5,6,4,23,6,5,5,1,5,7,2,23,6,5,5,1,5,7)
x2<- c(7,3,6,3,2,2,5,2,2,2,2,2,6,5,4,3,2,3,2,2,6,5,4,3)

type <- c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","c","c","c","c","c","c","c","c")
generation<- c(1,1,1,1,2,2,3,3,1,2,2,2,3,3,4,4,1,2,2,2,3,3,4,4)
year<-         c(2004,2005,2006,2007,2008,2009,2010,2011,2004,2005,2006,2007,2008,2009,2010,2011,2004,2005,2006,2007,2008,2009,2010,2011)
data <- data.frame(y, x1, x2, type, generation, year)

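The analysis I am doing now looks at one year at a time and predicts the following one. In essence, it runs several separate analyses, each considering only the data up to a given time point, and then predicts the next (only the immediately following) period.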

I tried to set up an example covering the three types (a, b, c):

data2004 <- subset(data, year==2004)
data2005 <- subset(data, year==2005)
m1 <- lm(y~x1+x2, data=data2004)
preds <- predict(m1, data2005)

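How can I automate this? My preferred output would be a predicted value for each type, that is, a prediction for every observation present in the next period (the original data has 200 periods).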

Thanks in advance, much appreciated!

Answer 1

The following may be closer to what you want.

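dat <- data[, c("y", "x1", "x2", "year")] ## assumption: dat is the question's data reduced to the response, the predictors, and year
uq.year <- sort(unique(dat$year)) ## sorting so that i+1 element is the year after ith element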
year <- dat$year
dat$year <- NULL ## we want everything in dat to be either the response or a predictor

model <- type ## type (a, b, or c) of each row, taken from the question's data, where rows are grouped by type rather than by year

predlist <- vector("list", length(uq.year) - 1) ## there is 1 prediction fewer than the number of unique years

for(i in 1:(length(uq.year) - 1))
{
  mod <- lm(y ~ ., data = subset(dat, year == uq.year[i]))
  predlist[[i]] <- predict(mod, subset(dat, subset = year == uq.year[i + 1], select = -y))      
  names(predlist[[i]]) <- model[year == uq.year[i + 1]] ## labeling each prediction
}

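The reason we want dat to contain only the modeling variables (and not year) is that we can then use the y ~ . notation and avoid spelling out every predictor in the lm call.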
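As a rough sketch of how the loop's output could be reshaped into one row per type and period, which is what the question asks for, the following rebuilds the question's data and collects the one-step-ahead predictions in a data frame. It assumes dat holds only y, x1, and x2, and keeps year and type as separate vectors. The names first_fit, one_step, and preds are hypothetical and do not appear in the original code. With only three rows per year the fits are saturated and several are rank-deficient, so R may warn when predicting; treat this as an illustration rather than a finished analysis.

```r
# Data from the question (type and year written compactly with rep()).
y  <- c(2,2,6,3,2,23,5,6,4,23,3,4,3,87,5,7,4,23,3,4,3,87,5,7)
x1 <- c(3,4,6,3,3,23,5,6,4,23,6,5,5,1,5,7,2,23,6,5,5,1,5,7)
x2 <- c(7,3,6,3,2,2,5,2,2,2,2,2,6,5,4,3,2,3,2,2,6,5,4,3)
type <- rep(c("a", "b", "c"), each = 8)
year <- rep(2004:2011, times = 3)

# Assumption: dat contains only the response and the predictors,
# so that y ~ . expands to y ~ x1 + x2.
dat <- data.frame(y, x1, x2)
uq.year <- sort(unique(year))

# Check how the dot is expanded for the first year's fit.
first_fit <- lm(y ~ ., data = dat[year == uq.year[1], ])
formula(first_fit)

# Fit on year i, predict year i + 1, and return one row per observation.
one_step <- function(i) {
  nxt <- year == uq.year[i + 1]
  fit <- lm(y ~ ., data = dat[year == uq.year[i], ])
  data.frame(type = type[nxt],
             year = uq.year[i + 1],
             pred = unname(predict(fit, newdata = dat[nxt, ])))
}

# Stack the per-period results into a single data frame.
preds <- do.call(rbind, lapply(seq_len(length(uq.year) - 1), one_step))
head(preds)
```

Subsetting the result, for example preds[preds$type == "a", ], then gives the predictions for a single type across all predicted periods.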

Obtaining predicted values for the next period
