Question

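I have several predictive models created with the same trainControl. These models have to be created beforehand (i.e., I cannot use caretList to train several models at the same time).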

Below is my minimal example. When I manually combine several (already created) models and pass them to caretStack,

library("kernlab")
library("rpart")
library("caret")
library("caretEnsemble")

trainingControl <- trainControl(method='cv', number=10, savePredictions = "final", classProbs=TRUE)
data(spam)
ds <- spam
tr <- ds[sample(nrow(ds),3221),]
te <- ds[!(rownames(ds) %in% rownames(tr)),]
model <- train(tr[,-58], tr$type, 'svmRadial', trControl = trainingControl)
model2 <- train(tr[,-58], tr$type, 'rpart', trControl = trainingControl)
multimodel <- list(svm = model, rpart = model2)
class(multimodel) <- "caretList"
stack <- caretStack(multimodel, method = "rf", metric = "ROC", trControl = trainingControl)

the library throws an error:

Component models do not have the same re-sampling strategies.

Why? I used the same strategy to generate the base models.

I found this "cast" to the caretList class in the GitHub discussion zachmayer/caretEnsemble/issues/104.

Answer 1

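You are almost there. One thing to keep in mind: when you want to use caretEnsemble, you must set the resampling indexes through the 'index' option of trainControl. Without an explicit index, each call to train draws its own random folds, so the models end up with different resamples even though they share one trainControl object. If you run caretList, it tends to set the index itself, but it is better to set it yourself. This is especially true when you train models outside of caretList: you need to make sure the resampling is the same for all of them. You can also see this in the example in the GitHub issue you refer to.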

trainingControl <- trainControl(method='cv', 
                                number=10, 
                                savePredictions = "final", 
                                classProbs=TRUE, 
                                index=createResample(tr$type)) # this needs to be set.

This will make sure your code runs.
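As a minimal sketch of the same idea, the snippet below builds the resampling indexes once, trains the two models from the question separately, and checks that both ended up with identical folds before they are combined. It swaps in createFolds(..., returnTrain = TRUE) so that the indexes are real 10-fold partitions matching method = 'cv' (createResample draws bootstrap samples instead). The seed and the names folds, sharedControl, svmModel and rpartModel are assumptions chosen for illustration.

```r
library(caret)
library(caretEnsemble)
library(kernlab)  # provides the spam data set and the svmRadial backend
library(rpart)

set.seed(42)  # assumption: fixed seed so the split and folds are reproducible
data(spam)
tr <- spam[sample(nrow(spam), 3221), ]

# Build the resampling indexes once. With returnTrain = TRUE each list
# element holds the row numbers used for training in one fold.
folds <- createFolds(tr$type, k = 10, returnTrain = TRUE)

sharedControl <- trainControl(method = "cv",
                              number = 10,
                              savePredictions = "final",
                              classProbs = TRUE,
                              index = folds)

# The models can be trained at different times, as long as every call
# receives the same index.
svmModel   <- train(tr[, -58], tr$type, method = "svmRadial",
                    trControl = sharedControl)
rpartModel <- train(tr[, -58], tr$type, method = "rpart",
                    trControl = sharedControl)

# Sanity check before stacking: both models were resampled on the same folds.
stopifnot(identical(svmModel$control$index, rpartModel$control$index))
```

If the check passes, the two models can be put into a list and cast to caretList as in the question. When the models are trained in separate sessions, saving folds (for example with saveRDS) and reloading it is one way to keep the index identical.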

Just note that the example code you provided, as written, will return an error.

caretEnsemble: Component models do not have the same re-sampling strategies
