I am working through exercise 6.9 from the following link: https://notendur.hi.is/map27/ISLR/ISLRChapter6.html
Specifically, the instructor creates the training and test data sets this way:
set.seed(1)
trainRows = sample(dim(College)[1], ceiling(dim(College)[1]/2))
train = is.element(c(1:dim(College)[1]),trainRows)
test = !train
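To see what this construction produces, here is a minimal sketch on a toy data frame. The `df` data is made up and only stands in for College; everything else is base R.

```r
# Toy data frame standing in for College (hypothetical data, base R only)
set.seed(1)
df <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))
n <- nrow(df)

# Same construction as above: sampled row numbers, then a logical mask
trainRows <- sample(n, ceiling(n / 2))
train <- is.element(seq_len(n), trainRows)
test <- !train

is.logical(train)            # TRUE: a vector of TRUE/FALSE values
is.data.frame(train)         # FALSE: the mask itself holds no columns
is.data.frame(df[train, ])   # TRUE: the subset is what a model can use

# The mask marks exactly the sampled rows; its negation marks the rest
identical(which(train), sort(trainRows))
identical(which(test), setdiff(seq_len(n), trainRows))
```

The mask only says which rows to keep, so it always has to be applied to the data frame, as in `df[train, ]`.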
Now the "train" and "test" objects are logical vectors. These train and test objects work with lm, ridge and lasso:
#Linear Regression
fit = lm(Apps~., data=College[train, ])
fit.pred = predict(fit, College[test, ])
mean((College[test, ][, "Apps"] - fit.pred)^2)
#Ridge Regression
trainMat = model.matrix(Apps~., data=College[train, ])
testMat = model.matrix(Apps~., data=College[test, ])
grid = 10 ^ seq(10, -10, length=100)
ridgeModel = cv.glmnet(trainMat, College[train, ][, "Apps"], alpha=0,
lambda=grid)
optLambda = ridgeModel$lambda.min
optLambda
#Ridge MSE
ridgePred = predict(ridgeModel, newx=testMat, s=optLambda)
mean((College[test, ][, "Apps"] - ridgePred)^2)
#Lasso Model
lassoModel = cv.glmnet(trainMat, College[train, ][, "Apps"], alpha=1,
lambda=grid)
optLambda = lassoModel$lambda.min
optLambda
#Test MSE - Lasso
lassoPred = predict(lassoModel, newx=testMat, s=optLambda)
mean((College[test, ][, "Apps"] - lassoPred)^2)
However, when I try to use these objects with the regsubsets function, I get the following message:
regfit.full=regsubsets(Apps ~ ., data = train)
Error in terms.formula(formula, data = data) :
'.' in formula and no 'data' argument
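My guess is that the error arises because `train` is a logical vector rather than a data frame, so there are no columns for the `.` in the formula to expand to. A minimal sketch of the call with the data subsetted the same way as in the lm example, assuming the ISLR package (for College) and the leaps package (for regsubsets) are installed; `nvmax = 17` is my own choice so that all predictors can enter:

```r
library(ISLR)    # assumed source of the College data
library(leaps)   # assumed source of regsubsets()

set.seed(1)
trainRows <- sample(nrow(College), ceiling(nrow(College) / 2))
train <- is.element(seq_len(nrow(College)), trainRows)
test <- !train

# Pass the subsetted data frame, not the logical mask itself
regfit.full <- regsubsets(Apps ~ ., data = College[train, ], nvmax = 17)
summary(regfit.full)
```

Written this way, the same `train` mask is reused for regsubsets without creating a second split.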
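However, when I create the training/test objects the following way (separate from the ones used with the ridge and lasso regression package), it works with regsubsets:
index = sample(1:nrow(College), size=0.5*nrow(College))
train_2 = College[index,]
test_2 = College[-index,]
I now have two training & test sets, which is obviously not ideal. Is there a way to create just one training/test set that works with both glmnet and regsubsets?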
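One possible approach, sketched under the assumption that the ISLR, leaps and glmnet packages are installed: keep a single index vector and subset College with it for every model. regsubsets takes the training data frame directly, while cv.glmnet takes a numeric matrix built with model.matrix from the same rows. The `floor()` call, `nvmax = 17` and the default lambda sequence are my own choices, not part of the exercise.

```r
library(ISLR)     # assumed source of the College data
library(leaps)    # assumed source of regsubsets()
library(glmnet)   # assumed source of cv.glmnet()

set.seed(1)
index <- sample(nrow(College), size = floor(0.5 * nrow(College)))
train_2 <- College[index, ]
test_2 <- College[-index, ]

# Best subset selection on the training data frame
regfit <- regsubsets(Apps ~ ., data = train_2, nvmax = 17)

# Ridge regression on matrices built from the same split
trainMat <- model.matrix(Apps ~ ., data = train_2)
testMat <- model.matrix(Apps ~ ., data = test_2)
ridgeModel <- cv.glmnet(trainMat, train_2$Apps, alpha = 0)
ridgePred <- predict(ridgeModel, newx = testMat, s = "lambda.min")

# Test MSE for ridge
mean((test_2$Apps - ridgePred)^2)
```

Setting `alpha = 1` in the cv.glmnet call gives the lasso on the same split.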
Thanks