Plotting prediction error against model size using forward stepwise selection and cross-validation

Time: 2018-11-13 04:29:52

Tags: r regression

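After running forward selection, I need to plot prediction error against model size using cross-validation. I split the data into two halves and used the leaps package to find the best model of each size. However, I can't figure out how to obtain the prediction errors I need. The code I tried produces an error: "1 linear dependencies found. Error in val_matrix[, names(coefi)] : subscript out of bounds"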

library(leaps)  # provides regsubsets()
n = 400
p = 200
s = 10
X = matrix(rnorm(n*p),n,p)
X = scale(X, center = FALSE, scale = sqrt(colSums(X^2)))
beta = c(rep(5,10), rep(0,p-10))
Y = X%*%beta + rnorm(n)
tr <- sample(1:400, 200, replace = FALSE)
train <- X[tr,]
validation <- X[-tr,]
d <- regsubsets(Y[tr,]~train, nvmax=30, data = as.data.frame(train), method = c("forward"))
val_matrix <- model.matrix(Y[-tr,]~validation, data = as.data.frame(validation))
val_errors = rep(0,30)

for (i in 1:30){
        coefi = coef(d, id=i)
        predi = val_matrix[,names(coefi)]%*%coefi
        val_errors[i] = Y[-tr,] - predi
}
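For reference, here is a minimal sketch of one way the validation-set version could be made to run. It is not a confirmed fix. It assumes the "subscript out of bounds" error comes from a name mismatch: a formula written as `Y[tr,]~train` labels the coefficients after the matrix `train`, while the validation model matrix labels its columns after `validation`, so `names(coefi)` is not found. The "1 linear dependencies found" message is probably only a warning, since 200 training rows cannot support 200 predictors plus an intercept. The column names `x1`..`x200`, the data frames `train_df` and `valid_df`, and the seed are introduced here for illustration and are not in the original. The sketch also stores the mean squared error per model size instead of the full residual vector.

```r
library(leaps)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
n <- 400
p <- 200
X <- matrix(rnorm(n * p), n, p)
X <- scale(X, center = FALSE, scale = sqrt(colSums(X^2)))
colnames(X) <- paste0("x", seq_len(p))  # hypothetical names shared by both halves
beta <- c(rep(5, 10), rep(0, p - 10))
Y <- drop(X %*% beta + rnorm(n))

# Split into training and validation halves, keeping identical column names
tr <- sample(n, n / 2)
train_df <- data.frame(y = Y[tr], X[tr, ])
valid_df <- data.frame(y = Y[-tr], X[-tr, ])

# Forward stepwise selection up to 30 variables
# (a "linear dependencies found" warning is expected here)
fit <- regsubsets(y ~ ., data = train_df, nvmax = 30, method = "forward")

# Validation design matrix whose column names match the coefficient names
val_matrix <- model.matrix(y ~ ., data = valid_df)

val_errors <- numeric(30)
for (i in 1:30) {
  coefi <- coef(fit, id = i)
  predi <- val_matrix[, names(coefi), drop = FALSE] %*% coefi
  val_errors[i] <- mean((valid_df$y - predi)^2)  # validation MSE for size i
}

plot(1:30, val_errors, type = "b",
     xlab = "Model size", ylab = "Validation MSE")
```

This uses a single train/validation split, as in the question. For k-fold cross-validation, the same loop would be repeated once per fold and the errors averaged for each model size.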

0 answers:

No answers