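I am working through the ISLR book and trying to use 10-fold cross-validation to find the best lambda for a ridge regression model. I tried cv.glmnet and caret's train function with very similar configurations, but the results are quite different: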
library(ISLR)
library(glmnet)
library(caret)
hit = na.omit(Hitters)
grid = 10 ^ seq(10, -2, length= 100) # from 10B to .01, the grid for lambda
x = model.matrix(Salary ~ ., hit)[,-1]
y = hit$Salary
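# inTrain is not defined anywhere in this post; the 50/50 random split
# below is an assumption added only so the snippet runs
set.seed(1)
inTrain = sample(1:nrow(x), nrow(x) / 2)
train.x = x[inTrain,]
train.y = y[inTrain]
test.x = x[-inTrain,]
test.y = y[-inTrain]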
cv.glmnet
set.seed(1)
cv.out = cv.glmnet(x, y, alpha = 0, nfolds = 10,
type.measure = "mse", lambda = grid)
best.lam = cv.out$lambda.min
best.lam
#231.013
cv.out$cvm[which(cv.out$lambda == cv.out$lambda.min)]
# mean cross-validated MSE: 120385.5
caret
ridge.model.caret = train(x, y, method = "glmnet",
tuneGrid = expand.grid(alpha = 0, lambda = grid),
tuneLength = 100, metric = "RMSE",
trControl = trainControl(method = "cv", number = 10))
ridge.model.caret$bestTune
# 18.73
ridge.model.caret$results$RMSE[which.min(ridge.model.caret$results$RMSE)]^2
# mean cross-validated MSE: 108339
Could you help me figure out what I am missing? Or does this mean that the two different lambdas give results that are close enough?
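
One way to check whether the gap comes from the random fold assignment is to hand both functions the same folds. The following is a minimal sketch, not a diagnosis: the names foldid, train.idx, cv.shared and caret.shared are made up for illustration, and it assumes that cv.glmnet's foldid argument and trainControl's index argument behave as documented. Sharing folds removes only one source of variation. caret averages the per-fold RMSE values, so squaring that average is not the same quantity as the mean MSE that cv.glmnet reports, and the two packages may also handle the supplied lambda grid differently, so the numbers may still not match exactly.

```r
library(ISLR)
library(glmnet)
library(caret)

hit  <- na.omit(Hitters)
grid <- 10 ^ seq(10, -2, length = 100)
x <- model.matrix(Salary ~ ., hit)[, -1]
y <- hit$Salary

set.seed(1)
# Assumption: one shared random assignment of every row to one of 10 folds
foldid <- sample(rep(1:10, length.out = nrow(x)))

# cv.glmnet uses the supplied fold ids instead of drawing its own
cv.shared <- cv.glmnet(x, y, alpha = 0, lambda = grid,
                       foldid = foldid, type.measure = "mse")

# caret expects, for each resample, the row indices used for TRAINING
train.idx <- lapply(1:10, function(k) which(foldid != k))
names(train.idx) <- paste0("Fold", 1:10)

caret.shared <- train(x, y, method = "glmnet",
                      tuneGrid = expand.grid(alpha = 0, lambda = grid),
                      metric = "RMSE",
                      trControl = trainControl(method = "cv", number = 10,
                                               index = train.idx))

# Selected lambda from each package
cv.shared$lambda.min
caret.shared$bestTune$lambda

# For the selected caret model: mean of per-fold MSEs versus the
# square of the mean per-fold RMSE (the latter is never larger)
mean(caret.shared$resample$RMSE^2)
mean(caret.shared$resample$RMSE)^2
```

Plotting cv.shared with plot(cv.shared) shows how the cross-validated error changes across the grid, which helps judge whether two different lambdas really give comparable error.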