Extracted gbm final model does not return the same results as the trained gbm model

Asked: 2018-10-25 09:11:17

Tags: r r-caret gbm

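I am trying to use the final model extracted from a gbm model trained with caret, but the extracted model does not return factor results the way the trained model does. Judging by the values it returns, the extracted final model seems to work, but it returns only the raw computed values. How can I get factor (class label) results like those from the trained model?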

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

Sonar$Class <- ifelse(Sonar$Class == 'R', 0, 1)
Sonar$Class <- as.factor(Sonar$Class)
validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]
outcomename <- 'Class'
predictors <- names(training)[!names(training) %in% outcomename]

set.seed(7)
control <- trainControl(method = "repeatedcv",  number = 5,  repeats = 5)
model_gbm <- train(training[, predictors], training[, outcomename], method = 'gbm', trControl = control, tuneLength = 10)

predict(model_gbm, validation[,1:60])
[1] 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1

predict(model_gbm$finalModel, validation[,1:60], n.trees = 300)
[1]  -3.1174531  -1.8335718   5.0780422  -8.6681791   8.9634393  -1.4079936  11.7232458
[8]  18.4189859  14.3978772  11.3605253  13.4694812  10.2752696  11.4957672  10.0370462
[15]   8.6009983   0.3718381   0.1297673   2.4099186   6.7774090 -10.8356795 -10.1842065
[22]  -2.3222431  -8.1525336  -3.3665867 -10.7953353  -2.4607156 -11.4277641  -4.7164270
[29]  -6.3882544  -3.7306579  -6.9323133  -4.2643347  -0.2128462  -9.3395850 -13.0759289
[36] -12.8259643  -6.5314340 -12.7968160 -16.6217507 -12.0370978  -3.1100361

1 Answer:

Answer 0: (score: 1)

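The predict.gbm function has a type argument, which can be "response" or "link". With type = "link" (the default) it returns values on the link scale, which is what you are seeing; to get predicted probabilities, set it to "response". To convert those probabilities into class predictions, apply a threshold (caret's train uses 0.5). Here is an example: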

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]

set.seed(7)
control <- trainControl(method = "repeatedcv",
                        number = 2,
                        repeats = 2)
model_gbm <- train(Class~.,
                   data = training,
                   method = 'gbm',
                   trControl = control,
                   tuneLength = 3)

Predicting with caret:

preds1 <- predict(model_gbm, validation[,1:60], type = "prob")

Predicting with gbm:

library(gbm)
# n.trees should equal the tuned value, model_gbm$bestTune$n.trees
preds2 <- predict(model_gbm$finalModel, validation[,1:60], n.trees = 100, type = "response")

all.equal(preds1[,1], preds2)
#output
TRUE

Or for classes:

preds1_class <- predict(model_gbm, validation[,1:60])

Check that they equal the thresholded gbm predictions:

all.equal(
  as.factor(ifelse(preds2 > 0.5, "M", "R")),
  preds1_class)
#output
TRUE
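
The same idea can be applied directly to the link-scale numbers printed in the question. The following is a minimal base-R sketch, not the exact code caret runs internally. It assumes the final model was fit with the bernoulli distribution (so the response is the inverse logit of the link) and that the returned probability refers to the first factor level, which is "0" in the question's recoding. The variable names are made up for illustration.

```r
# First five link-scale values copied from the question's
# predict(model_gbm$finalModel, ...) output
link_scores <- c(-3.1174531, -1.8335718, 5.0780422, -8.6681791, 8.9634393)

# Inverse logit: under the bernoulli assumption this is what
# type = "response" would return for the same rows
prob_first_level <- plogis(link_scores)

# Assumption: the probability is for the first factor level ("0" here)
class_levels <- c("0", "1")

# Threshold at 0.5 and rebuild a factor with the original levels
pred_class <- factor(
  ifelse(prob_first_level > 0.5, class_levels[1], class_levels[2]),
  levels = class_levels
)

print(pred_class)
```

Under these assumptions the result is 1 1 0 1 0, which agrees with the first five class predictions shown in the question for predict(model_gbm, validation[,1:60]).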