I am building a C5.0 model in R using the caret package:

control <- trainControl(method = "repeatedcv",
                        number = 10,
                        repeats = 3,
                        classProbs = TRUE,
                        sampling = 'smote',
                        returnResamp = "all",
                        summaryFunction = twoClassSummary)
grid <- expand.grid(.winnow = c(FALSE, TRUE),
                    .trials = c(1, 5, 10, 15, 20, 25, 30, 40, 45, 50),
                    .model = c("tree"),
                    .splits = c(2, 5, 10, 15, 20, 25, 50))

# c5info is presumably a custom caret model definition (not shown here)
c5_model <- train(label ~ .,
                  data = train,
                  trControl = control,
                  method = c5info,
                  tuneGrid = grid,
                  preProcess = c("center", "scale", "nzv", "corr"),
                  verbose = FALSE)

Is it possible to pass a custom cutoff to the preProcess function for the correlation filter, say 0.75 or any other value I want?

Answer 0 (score: 1)

You can specify the preprocessing options in trainControl, through its preProcOptions argument.

An example with a ranger model:

library(caret)
library(mlbench) #for the data
data(Sonar)

ctrl <- trainControl(method = "repeatedcv",
                     number = 10,
                     repeats = 3,
                     classProbs = TRUE,
                     sampling = 'smote',
                     returnResamp = "all",
                     summaryFunction = twoClassSummary,
                     preProcOptions = list(cutoff = 0.75)) # all go in this list

Fit the model with this cutoff of 0.75:

grid <- expand.grid(.mtry = c(2, 5, 10),
                    .min.node.size = 2,
                    .splitrule = "gini")

fit_model <- train(Class ~ .,
                   data = Sonar,
                   trControl = ctrl,
                   metric = "ROC",
                   method = "ranger",
                   tuneGrid = grid,
                   preProcess = c("center", "scale", "nzv", "corr"),
                   verbose = FALSE)

fit_model$preProcess
#output
Created from 679 samples and 60 variables

Pre-processing:
  - centered (26)
  - ignored (0)
  - removed (34)
  - scaled (26)
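
If you want to see what a given cutoff does before running the whole resampling loop, one option is to call preProcess directly with the same methods. This is only a minimal sketch: it runs on the full Sonar predictors, without SMOTE or resampling, so the counts it prints will not necessarily match the ones reported by train. The names predictors, pp and filtered are made up for the illustration.

```r
library(caret)
library(mlbench)  # for the Sonar data
data(Sonar)

# Predictors only; the outcome column "Class" is left out of the filter
predictors <- Sonar[, names(Sonar) != "Class"]

# Same preprocessing steps as in train(), with a custom correlation cutoff
pp <- preProcess(predictors,
                 method = c("center", "scale", "nzv", "corr"),
                 cutoff = 0.75)
print(pp)

# Apply the fitted preprocessing and check how many columns remain
filtered <- predict(pp, predictors)
ncol(filtered)
```
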

Using a lower cutoff of 0.6 removes more columns:

ctrl2 <- trainControl(method = "repeatedcv",
                      number = 10,
                      repeats = 3,
                      classProbs = TRUE,
                      sampling = 'smote',
                      returnResamp = "all",
                      summaryFunction = twoClassSummary,
                      preProcOptions = list(cutoff = 0.6))

fit_model2 <- train(Class ~ .,
                    data = Sonar,
                    trControl = ctrl2,
                    metric = "ROC",
                    method = "ranger",
                    tuneGrid = grid,
                    preProcess = c("center", "scale", "nzv", "corr"),
                    verbose = FALSE)

fit_model2$preProcess
#output
Created from 679 samples and 60 variables

Pre-processing:
  - centered (23)
  - ignored (0)
  - removed (37)
  - scaled (23)
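
The "corr" step relies on caret's findCorrelation, so you can also inspect which columns a cutoff would flag without fitting anything. A rough sketch, again on the raw Sonar predictors (no SMOTE, no nzv filter), so treat the numbers as indicative only; cor_mat, drop_075 and drop_060 are illustrative names.

```r
library(caret)
library(mlbench)  # for the Sonar data
data(Sonar)

predictors <- Sonar[, names(Sonar) != "Class"]
cor_mat <- cor(predictors)

# Column indices that findCorrelation suggests removing at each cutoff
drop_075 <- findCorrelation(cor_mat, cutoff = 0.75)
drop_060 <- findCorrelation(cor_mat, cutoff = 0.60)

# Number of flagged columns per cutoff
length(drop_075)
length(drop_060)

# Names of the columns flagged at the 0.75 cutoff
names(predictors)[drop_075]
```
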
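
With preProcOptions = list(cutoff = 0.95) in trainControl (model fit_model3, otherwise fitted the same way), far fewer columns are removed:

fit_model3$preProcess
#output
Created from 679 samples and 60 variables

Pre-processing:
  - centered (55)
  - ignored (0)
  - removed (5)
  - scaled (55)

Looks like it works.

Similarly, you can pass any other preprocessing option through the same list; check all of them in the help page for preProcess.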
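
As an illustration of passing more than one option, the list below combines the correlation cutoff with the two thresholds used by the "nzv" filter (freqCut and uniqueCut are arguments of preProcess). The specific values are arbitrary and only meant as a sketch; sampling is left out here to keep the example dependency-free, and ctrl_opts is a made-up name.

```r
library(caret)

ctrl_opts <- trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 3,
                          classProbs = TRUE,
                          summaryFunction = twoClassSummary,
                          preProcOptions = list(
                            cutoff    = 0.75,    # correlation filter ("corr")
                            freqCut   = 90 / 10, # near-zero-variance filter ("nzv")
                            uniqueCut = 5        # near-zero-variance filter ("nzv")
                          ))

# The options are stored on the control object and picked up by train()
# when preProcess = c("center", "scale", "nzv", "corr") is requested
ctrl_opts$preProcOptions
```
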