Question

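I am using rpart to build a classification model for my data, but I don't know how to set the bucket size (minbucket) so that the model neither overfits nor underfits. I read that the train method of the caret package offers a way to find the best bucket size, so I wrote a few lines of R: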

tree <- rpart(y ~ x1 + x2 + x3 + x4 + x5 + x6, method = 'class', data = train, minbucket = 15)  # (I have anonymized the formula of my model)
numfolds <- trainControl(method = "cv", number = 10)
cpGrid <- expand.grid(.cp = seq(0.0001, 0.005, 0.0001))
train(y ~ x1 + x2 + x3 + x4 + x5 + x6, data = train, method = "rpart", trControl = numfolds, tuneGrid = cpGrid)

The printed output gives:

RMSE was used to select the optimal model using  the smallest value.
The final value used for the model was cp = 0.0024.
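The "RMSE" line suggests that caret treated y as a numeric outcome and fitted a regression tree, while the rpart call uses method = 'class'. Below is a minimal sketch of the same cp search with a factor outcome, in which case caret selects by Accuracy. The data frame `df` and its columns are hypothetical stand-ins for the anonymized data, and the cp grid is chosen only for illustration.

```r
library(caret)   # provides train() and trainControl()
library(rpart)   # tree engine behind method = "rpart"

set.seed(1)
# Hypothetical stand-in data: a 0/1 outcome driven by two predictors
n  <- 300
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- as.integer(df$x1 + 0.5 * df$x2 + rnorm(n, sd = 0.5) > 0)

# A numeric outcome makes caret fit a regression tree and report RMSE;
# converting it to a factor makes caret fit a classification tree
df$y <- factor(df$y, levels = c(0, 1), labels = c("no", "yes"))

numfolds <- trainControl(method = "cv", number = 10)
cpGrid   <- expand.grid(cp = seq(0.001, 0.05, 0.005))

fit <- train(y ~ x1 + x2, data = df, method = "rpart",
             trControl = numfolds, tuneGrid = cpGrid)

print(fit$metric)    # metric used to pick the best model
print(fit$bestTune)  # selected cp value
```

If y in the real data is already a factor, this conversion is unnecessary and the RMSE message has some other cause.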

OK, so I took note of that and used cp = 0.0024 in my rpart model:

treeCV <- rpart(y ~ x1 + x2 + x3 + x4 + x5 + x6, method = 'class', data = train, cp = 0.0024)
prp(treeCV)

I only have the "prp" visualization.
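Note that the grid above searches over cp only, so it does not say anything about minbucket. One possible way to compare bucket sizes is a manual k-fold cross-validation loop using rpart alone. This is a rough sketch rather than a definitive method: the data frame `df`, the candidate values in `buckets`, and the helper `cv_accuracy` are all hypothetical, and cp is fixed at the 0.0024 reported above.

```r
library(rpart)

set.seed(1)
# Hypothetical stand-in data with a two-level factor outcome
n  <- 400
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- factor(ifelse(df$x1 + 0.5 * df$x2 + rnorm(n, sd = 0.5) > 0, "yes", "no"))

k       <- 10
folds   <- sample(rep(seq_len(k), length.out = n))  # random fold label per row
buckets <- c(5, 10, 15, 25, 50)                     # candidate minbucket values

# Mean held-out accuracy over the k folds for one minbucket value
cv_accuracy <- function(mb) {
  acc <- sapply(seq_len(k), function(i) {
    fit  <- rpart(y ~ x1 + x2, data = df[folds != i, ], method = "class",
                  control = rpart.control(minbucket = mb, cp = 0.0024))
    pred <- predict(fit, newdata = df[folds == i, ], type = "class")
    mean(pred == df$y[folds == i])
  })
  mean(acc)
}

results <- data.frame(minbucket = buckets,
                      accuracy  = sapply(buckets, cv_accuracy))
best <- results$minbucket[which.max(results$accuracy)]

print(results)
print(best)
```

The same folds are reused for every candidate, so differences in accuracy come from minbucket rather than from a different random split.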

Any help? Please let me know if more information is needed.

Using caret

0 answers: