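I have been using the leapForward method from the leaps package together with caret, and found that it only gives me models with up to 5 variables. According to the leaps documentation, you can change nvmax to any subset size you want.
I can't seem to fit this into the caret wrapper. I have tried passing it in the train call, as well as creating an expand.grid line, and it doesn't seem to work. Any help would be appreciated!
My code: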
library(caret)
data <- read.csv(file="C:/mydata.csv", header=TRUE, sep=",")
fitControl <- trainControl(method = "loocv")
x <- data[, -19]
y <- data[, 19]
lmFit <- train(x=x, y=y,'leapForward', trControl = fitControl)
summary(lmFit)
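Answer 0 (score: 0)
By default, caret tries only a small set of candidate values for each tuning parameter (three, unless tuneLength is changed).
You can use the tuneGrid option to specify your own grid of parameter values.
Here is a reproducible example with the BloodBrain dataset. NB: I had to transform the predictors with PCA to avoid multicollinearity problems.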
library(caret)
data(BloodBrain, package = "caret")
dim(bbbDescr)
#> [1] 208 134
X <- princomp(bbbDescr)$scores[,1:131]
Y <- logBBB
fitControl <- trainControl(method = "cv")
Default: caret chooses the candidate values of nvmax itself
lmFit <- train(y = Y, x = X,'leapForward', trControl = fitControl)
lmFit
#> Linear Regression with Forward Selection
#>
#> 208 samples
#> 131 predictors
#>
#> No pre-processing
#> Resampling: Cross-Validated (10 fold)
#> Summary of sample sizes: 187, 188, 187, 187, 187, 187, ...
#> Resampling results across tuning parameters:
#>
#>   nvmax  RMSE       Rsquared   MAE
#>   2      0.6682545  0.2928583  0.5286758
#>   3      0.7008359  0.2652202  0.5527730
#>   4      0.6781190  0.3026475  0.5215527
#>
#> RMSE was used to select the optimal model using the smallest value.
#> The final value used for the model was nvmax = 2.
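The three values of nvmax above come from caret's default grid. A quick way to widen that grid without building one by hand is the tuneLength argument of train. The following is only a minimal sketch on simulated data: the objects X_sim, y_sim, ctrl and fit_len are made up for illustration and are not part of the BloodBrain example, and the leaps package is assumed to be installed.

```r
library(caret)  # the "leapForward" method also needs the leaps package

set.seed(1)

# Simulated data, purely illustrative (hypothetical, not the BloodBrain data)
n <- 100
X_sim <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))
y_sim <- 2 * X_sim$V1 - X_sim$V2 + rnorm(n)

ctrl <- trainControl(method = "cv", number = 5)

# tuneLength asks caret for more candidate values of nvmax
# than the default of three
fit_len <- train(x = X_sim, y = y_sim, method = "leapForward",
                 trControl = ctrl, tuneLength = 8)

# Candidate subset sizes that were evaluated
fit_len$results$nvmax

# Coefficients of the selected subset in the final regsubsets model
coef(fit_len$finalModel, id = fit_len$bestTune$nvmax)
```

tuneLength only controls how many candidates are tried; to choose the exact values of nvmax, tuneGrid is the more direct option.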
Grid search with values of your choice.
Note: expand.grid is not strictly needed here. It is useful for combining several tuning parameters.
lmFit <- train(y = Y, x = X,'leapForward', trControl = fitControl,
tuneGrid = expand.grid(nvmax = seq(1, 30, 2)))
lmFit
#> Linear Regression with Forward Selection
#>
#> 208 samples
#> 131 predictors
#>
#> No pre-processing
#> Resampling: Cross-Validated (10 fold)
#> Summary of sample sizes: 188, 188, 188, 186, 187, 187, ...
#> Resampling results across tuning parameters:
#>
#>   nvmax  RMSE       Rsquared    MAE
#>    1     0.7649633  0.07840817  0.5919515
#>    3     0.6952295  0.27147443  0.5250173
#>    5     0.6482456  0.35953363  0.4828406
#>    7     0.6509919  0.37800159  0.4865292
#>    9     0.6721529  0.35899937  0.5104467
#>   11     0.6541945  0.39316037  0.4979497
#>   13     0.6355383  0.42654189  0.4794705
#>   15     0.6493433  0.41823974  0.4911399
#>   17     0.6645519  0.37338055  0.5105887
#>   19     0.6575950  0.39628133  0.5084652
#>   21     0.6663806  0.39156852  0.5124487
#>   23     0.6744933  0.38746853  0.5143484
#>   25     0.6709936  0.39228681  0.5025907
#>   27     0.6919163  0.36565876  0.5209107
#>   29     0.7015347  0.35397968  0.5272448
#>
#> RMSE was used to select the optimal model using the smallest value.
#> The final value used for the model was nvmax = 13.
plot(lmFit)
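To illustrate the note about expand.grid: with a single tuning parameter, a plain data.frame carries the same values, while expand.grid becomes useful once several parameters have to be crossed. This is a small base-R sketch; the names grid_df, grid_eg and multi_grid are made up here, and alpha and lambda (the tuning parameters of caret's "glmnet" method) appear only as an example of a two-parameter grid.

```r
# One tuning parameter: a plain data.frame is enough
grid_df <- data.frame(nvmax = seq(1, 30, 2))
grid_eg <- expand.grid(nvmax = seq(1, 30, 2))

# Both objects hold the same candidate values
all.equal(grid_df$nvmax, grid_eg$nvmax)

# Several tuning parameters: expand.grid builds every combination
# (alpha and lambda are used here only for illustration)
multi_grid <- expand.grid(alpha = c(0, 0.5, 1), lambda = c(0.01, 0.1))
nrow(multi_grid)

# Either single-parameter grid could be passed to train(), e.g.
# train(y = Y, x = X, method = "leapForward", trControl = fitControl,
#       tuneGrid = grid_df)
```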
Created on 2018-03-08 by the reprex package (v0.2.0).