Question

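I built a decision tree from training data using the rpart package in R. Now I have more data, and I want to run it through the tree to check the model. Logically/iteratively, I want to do the following: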

for each datapoint in new data
     run point thru decision tree, branching as appropriate
     examine how tree classifies the data point
     determine if the datapoint is a true positive or false positive

How can I do this in R?

Answer 1

For this to work, I assume you have split your data into a training subset and a test set.

To fit the model on the training data, you can use:
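A minimal sketch of such a split, assuming the `kyphosis` data frame that ships with rpart stands in for your data. The 70/30 ratio, the seed, and the names `train_idx`, `traindata` and `testdata` are illustrative choices, not requirements:

```r
library(rpart)                      # also provides the kyphosis example data

set.seed(42)                        # arbitrary seed, only for reproducibility
n <- nrow(kyphosis)
train_idx <- sample(n, size = floor(0.7 * n))   # row numbers used for training

traindata <- kyphosis[train_idx, ]  # about 70% of the rows
testdata  <- kyphosis[-train_idx, ] # the remaining rows, never seen by the model
```

Sampling row indices once and using the negated index for the test set guarantees that the two subsets do not overlap.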

library(rpart)
model <- rpart(y ~ ., data = traindata, method = "class", minbucket = 5)   # you have probably done this already

To apply it to the test set:

pred <- predict(model, testdata, type = "class")   # predicted labels; needs a classification tree (factor y or method = "class")

You then get a vector of predictions.
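For a classification tree, `predict()` returns a matrix of class probabilities unless you ask for labels. A small sketch of the difference, assuming the `kyphosis` data from rpart as a stand-in (the names `fit`, `prob` and `label` are illustrative):

```r
library(rpart)

# Classification tree on the example data shipped with rpart
fit <- rpart(Kyphosis ~ ., data = kyphosis, method = "class")

prob  <- predict(fit, kyphosis)                  # default for class trees: probability matrix
label <- predict(fit, kyphosis, type = "class")  # factor of predicted class labels

dim(prob)       # one row per observation, one column per class
levels(label)   # the class labels of the response
```

Only the `type = "class"` result can be compared element by element against the true labels.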

Your test set also contains the "true" answers. Let's say they are in the last column of the test set.

Simply comparing the two gives the result:

pred == testdata[ , last]  # where 'last' equals the index of 'y'

Where the elements are equal you get TRUE; a FALSE means the prediction was wrong.

pred == 1 & testdata[, last] == 1   # true positives: predicted 1 and actually 1 (assumes the positive class is coded as 1)
pred == testdata[, last]    # gives those that are correct

It can also be interesting to see what fraction you got right:

mean(pred == testdata[ , last])    # here TRUE will count as a 1, and FALSE as 0
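To separate true positives from false positives, as the question asks, a cross-tabulation of predicted against actual labels is one option. This sketch assumes the `kyphosis` data from rpart as a stand-in, treats "present" as the positive class, and uses an arbitrary split; the names `train`, `test`, `cm`, `tp` and `fp` are illustrative:

```r
library(rpart)

set.seed(1)                                   # arbitrary seed for a reproducible split
idx   <- sample(nrow(kyphosis), 55)           # 55 of the 81 rows for training
train <- kyphosis[idx, ]
test  <- kyphosis[-idx, ]

fit  <- rpart(Kyphosis ~ ., data = train, method = "class")
pred <- predict(fit, test, type = "class")    # predicted labels for the held-out rows

# Rows are predictions, columns are the true labels
cm <- table(predicted = pred, actual = test$Kyphosis)

tp <- cm["present", "present"]   # predicted positive, actually positive
fp <- cm["present", "absent"]    # predicted positive, actually negative
accuracy <- sum(diag(cm)) / sum(cm)
```

Because both `pred` and `test$Kyphosis` are factors with the same levels, the table is always 2 by 2, even when one of the cells is zero.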
