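I want to use caret to tune the hyperparameters of several machine learning models before fitting the best one. However, I'm having trouble fitting the final model and reading the fitted values from it.
Goal: I have a model matrix, df, and I want to use several machine learning algorithms (lm, glmnet, gbm, svm and random forest) to predict future values of x.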
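Here is what I'm trying to do:
1. I set up two versions of the train control (trainingTimeControl and fitTimeControl): one uses time slices, the other uses method = "none" and only fits the model.
2. I then train the models with tuneLength = 5 and trainingTimeControl.
3. I take bestTune and fit the model using fitTimeControl.
Here is a minimal working example: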
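To make step 1 concrete, here is a small sketch of the slices a "timeslice" control produces, using caret's createTimeSlices(). The window size of 365 is a hypothetical value picked only for illustration; it is not taken from my real setup. As far as I understand ?trainControl, the seeds list should have B + 1 elements, where B is the number of resamples, so the slice count also tells me how many seeds are needed.

```r
library(caret)

n_obs <- 730            # rows in the training set used in this question
initial_window <- 365   # hypothetical window size, for illustration only

# Rolling-origin slices: fixed-size training window, one-step-ahead test point
slices <- createTimeSlices(seq_len(n_obs),
                           initialWindow = initial_window,
                           horizon = 1,
                           fixedWindow = TRUE)

# Number of resamples: n_obs - initial_window - horizon + 1
n_resamples <- length(slices$train)
n_resamples

# First slice: train on the first 365 rows, test on the next one
range(slices$train[[1]])
slices$test[[1]]

# Per ?trainControl, seeds should have B + 1 elements (B = number of resamples)
n_seeds_needed <- n_resamples + 1
n_seeds_needed
```

With these assumed numbers there are far fewer resamples than rows, so a seeds list with one element per row would be longer than the B + 1 the documentation describes.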
library(caret)

df <- data.frame(date = seq(as.Date("2014/01/01"), by = "day", length.out = 1090),
                 x = rnorm(1090), y = rnorm(1090), z = rnorm(1090))
df_training <- df[1:730, ]
df_test <- df[731:nrow(df), ]
glmnet.mod <- list()
glmnet.fit <- list()
# Create the resampling seeds
set.seed(123)
n <- nrow(df_training)
tuneLength.num <- 5
seeds <- vector(mode = "list", length = n) # empty list with n elements
for (i in 1:(n - 1)) { # tuneLength.num random integers between 1 and 1000 per resample
  seeds[[i]] <- sample.int(1000, tuneLength.num)
}
# For the last (final) model:
seeds[[n]] <- sample.int(1000, 10)
# Assumption: trainingWindow is not defined anywhere in this post;
# 365 is a placeholder so the example runs.
trainingWindow <- 365
trainingTimeControl <- trainControl(method = "timeslice",
                                    initialWindow = trainingWindow,
                                    horizon = 1,
                                    fixedWindow = TRUE,
                                    returnResamp = "all",
                                    allowParallel = TRUE,
                                    seeds = seeds,
                                    savePredictions = TRUE)
fitTimeControl <- trainControl(method = "none",
                               allowParallel = TRUE)
glmnet.mod <- caret::train(x ~ . - date,
                           data = df_training,
                           method = "glmnet",
                           family = "gaussian",
                           trControl = trainingTimeControl,
                           tuneLength = tuneLength.num)
glmnet.fit <- caret::train(x ~ . - date,
                           data = df_test,
                           method = "glmnet",
                           family = "gaussian",
                           trControl = fitTimeControl,
                           tuneGrid = glmnet.mod$bestTune)
lm.mod <- caret::train(x ~ . - date,
                       data = df_training,
                       method = "lm",
                       trControl = trainingTimeControl,
                       tuneLength = tuneLength.num)
lm.fit <- caret::train(x ~ . - date,
                       data = df_test,
                       method = "lm",
                       trControl = fitTimeControl,
                       tuneGrid = lm.mod$bestTune)
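However, I can't access the fitted values for every algorithm. For lm they are simply lm.fit$finalModel$fitted.values, and for gbm they are gbm.fit$finalModel$fit, but I don't know what the equivalent is for glmnet or svm, for example. Also, am I fitting the final model the right way with the fitTimeControl setup I'm using?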
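For reference, here is a minimal sketch of the kind of model-agnostic access I'm after. It assumes that calling predict() on the object returned by train(), with the fitting data as newdata, is an acceptable way to get fitted values. The toy data frame and the names ctrl, lm_fit, fitted_lm and resid_lm are made up for this illustration. Only lm is fitted here to keep the sketch free of extra packages; I would expect the same predict() call to apply to glmnet.fit or an svm fit, but I have not confirmed that this is the recommended route.

```r
library(caret)

set.seed(123)
toy <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

# Same "fit only" control as fitTimeControl in the question
ctrl <- trainControl(method = "none")

lm_fit <- train(x ~ ., data = toy, method = "lm", trControl = ctrl)

# predict() on the train object goes through caret's own interface,
# so the call does not depend on the slots of the underlying model
fitted_lm <- predict(lm_fit, newdata = toy)

# For lm, compare against the slot used in the question
all.equal(unname(fitted_lm), unname(lm_fit$finalModel$fitted.values))

# In-sample residuals computed the same way for any model
resid_lm <- toy$x - fitted_lm
```

If this is valid, I could drop the per-model slot lookups ($fitted.values, $fit) and use one call for all five algorithms.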