Question

Given:

library(rpart)
data(iris)
fit <- rpart(Species ~ ., data = iris)
predict(fit)

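Does this return cross-validated (10-fold) predictions for the training data?

I could not find anything in the rpart documentation confirming that these are CV predictions.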

Answer 1

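With predict(fit) you get the predicted class probabilities for the training data (for a classification tree; for a regression tree you get the predicted means). These are not cross-validated predictions: the tree used for them is the single tree fitted to the full data set, which is shown by printing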

fit

## n= 150 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
## 1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)  
##   2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
##   3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)  
##     6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259) *
##     7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *
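To make the difference concrete, here is a minimal sketch contrasting the predictions from predict(fit) with out-of-fold predictions from a hand-written 10-fold loop. The fold assignment, the seed, and the names folds, cv_pred, and fold_fit are my own illustration and not part of rpart; rpart also ships xpred.rpart() for cross-validated predictions, which may be more convenient.

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris)

# predict(fit) with no newdata scores the training rows with the one fitted tree,
# so it matches predicting on iris explicitly (resubstitution predictions).
resub_prob <- predict(fit)
same_as_newdata <- all(resub_prob == predict(fit, newdata = iris))
resub_err <- mean(predict(fit, type = "class") != iris$Species)

# Manual 10-fold cross-validation (illustrative; fold assignment is an assumption).
set.seed(1)
k <- 10
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))
cv_pred <- factor(rep(NA_character_, nrow(iris)), levels = levels(iris$Species))

for (i in seq_len(k)) {
  held_out <- folds == i
  # Fit on the other k - 1 folds, then predict only the held-out rows.
  fold_fit <- rpart(Species ~ ., data = iris[!held_out, ])
  cv_pred[held_out] <- predict(fold_fit, newdata = iris[held_out, ], type = "class")
}

cv_err <- mean(cv_pred != iris$Species)
```

Here resub_err is the error of the tree on the data it was fitted to, while cv_err is estimated on rows each fold's tree never saw, so cv_err is the more honest estimate and will typically be somewhat larger.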

While fitting this tree, rpart also performs cross-validation internally; see, for example,

fit$cptable

##     CP nsplit rel error xerror       xstd
## 1 0.50      0      1.00   1.16 0.05127703
## 2 0.44      1      0.50   0.70 0.06110101
## 3 0.01      2      0.06   0.09 0.02908608

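So in this case the fitted tree (the last row, with two splits) also has the lowest cross-validated error (see the xerror column). On other data sets you may need to apply some additional pruning, for example by using the 1-SE pruning rule.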
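As a rough sketch of the 1-SE rule, assuming the usual reading of it (pick the smallest tree whose xerror is within one xstd of the minimum xerror), something like the following should work. The names cp_tab, best, threshold, chosen, and pruned are just for illustration, and the seed is arbitrary since xerror depends on the random fold assignment.

```r
library(rpart)

set.seed(42)  # xerror and xstd vary with the random CV folds
fit <- rpart(Species ~ ., data = iris)
cp_tab <- fit$cptable

# Row with the smallest cross-validated error.
best <- which.min(cp_tab[, "xerror"])

# 1-SE threshold: minimum xerror plus its standard error.
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]

# Rows are ordered from fewest to most splits, so the first row under the
# threshold is the simplest acceptable tree.
chosen <- min(which(cp_tab[, "xerror"] <= threshold))

# Prune back to the complexity parameter of the chosen row.
pruned <- prune(fit, cp = cp_tab[chosen, "CP"])
```

With the cptable shown above, the threshold would be about 0.09 + 0.029 = 0.119, which only the two-split tree satisfies, so nothing is pruned for iris.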
