fit <- rpart(unacc~., data = carTrain, method = 'class')
I have built a decision tree on carTrain,
and made predictions on carTest with
predict_unseen <- predict(fit, carTest, type = 'class')
where carTest is the unseen data.
Now I am creating a confusion matrix:
confusionMatrix(carTest$unacc,predict_unseen)
but I get an error:
> confusionMatrix(carTest$unacc, predict_unseen)
Error in confusionMatrix.default(carTest$unacc, predict_unseen) :
  the data cannot have more levels than the reference
Answer 0 (score: 0)
library(rpart)
library(imptree)
data(carEvaluation)
table(carEvaluation$acceptance)
> table(carEvaluation$acceptance)
  acc  good unacc vgood
  384    69  1210    65
Note that unacc is only one of the categories of the acceptance attribute, not a column of its own.
So you can do the following:
{set.seed(3456)
train <- caret::createDataPartition(carEvaluation$acceptance, p = .8, # partition 80%~20%
list = FALSE)
carTrain <- carEvaluation[train,]
carTest <- carEvaluation[-train,]
fit <- rpart::rpart(acceptance~., data = carTrain, method = 'class')
}
df <- data.frame(obs = carTest$acceptance,
                 pred = predict(fit, newdata = carTest, type = "class"))
# confusionMatrix() takes the predictions first and the reference (true labels) second
cfm <- caret::confusionMatrix(data = df$pred, reference = df$obs)
cfm
> cfm
Confusion Matrix and Statistics

          Reference
Prediction acc good unacc vgood
     acc    70    0    10     2
     good    5   12     1     0
     unacc   1    0   231     0
     vgood   0    1     0    11

Overall Statistics

               Accuracy : 0.9419
                 95% CI : (0.9116, 0.9641)
    No Information Rate : 0.7035
    P-Value [Acc > NIR] : < 2.2e-16

                  Kappa : 0.8762

 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: acc Class: good Class: unacc Class: vgood
Sensitivity              0.9211     0.92308       0.9545      0.84615
Specificity              0.9552     0.98187       0.9902      0.99698
Pos Pred Value           0.8537     0.66667       0.9957      0.91667
Neg Pred Value           0.9771     0.99693       0.9018      0.99398
Prevalence               0.2209     0.03779       0.7035      0.03779
Detection Rate           0.2035     0.03488       0.6715      0.03198
Detection Prevalence     0.2384     0.05233       0.6744      0.03488
Balanced Accuracy        0.9381     0.95248       0.9724      0.92157
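You do not need to follow this example exactly. I recommend reading the documentation of the caret and rpart packages to improve your code. Alternatively, you can provide a fully reproducible example.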
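If the error persists with your own data, it may help to check the factor levels directly. The error in the question is raised when the first argument of confusionMatrix() has more levels than the second. Below is a minimal sketch of one way to align them; the vectors truth and pred are hypothetical toy stand-ins for carTest$unacc and predict_unseen, not your actual data.

```r
library(caret)

# Hypothetical toy vectors (assumptions, not the asker's data):
# truth plays the role of carTest$unacc, pred the role of predict_unseen.
truth <- factor(c("acc", "unacc", "good", "unacc", "acc"))   # 3 levels
pred  <- factor(c("acc", "unacc", "unacc", "unacc", "acc"))  # 2 levels

# Compare the level sets; a mismatch here is what triggers the error
# when the factor with more levels is passed as the first argument.
levels(truth)
levels(pred)

# Re-level both factors onto the same level set (their union).
lv    <- union(levels(truth), levels(pred))
truth <- factor(truth, levels = lv)
pred  <- factor(pred,  levels = lv)

# Predictions go in `data`, true labels in `reference`.
cm <- confusionMatrix(data = pred, reference = truth)
cm$table
```

Re-levelling to the union keeps classes that the model never predicted as empty rows in the table rather than dropping them, so the per-class statistics still refer to the full set of classes.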