Question

我正在查看此数据集：https://archive.ics.uci.edu/ml/datasets/Credit+Approval。我建了一个ctree：

myFormula<-class~.          # class is a factor of "+" or "-"
ct <- ctree(myFormula, data = train)

现在我想将这些数据放入插入符号的confusionMatrix方法中，以获取与混淆矩阵相关的所有统计数据：

testPred <- predict(ct, newdata = test)

                #### This is where I'm doing something wrong ####
confusionMatrix(table(testPred, test$class),positive="+")
          ####  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ####

$positive
[1] "+"

$table
        td
testPred  -  +
       - 99  6
       + 20 88

$overall
      Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull AccuracyPValue  McnemarPValue 
  8.779343e-01   7.562715e-01   8.262795e-01   9.186911e-01   5.586854e-01   6.426168e-24   1.078745e-02 

$byClass
         Sensitivity          Specificity       Pos Pred Value       Neg Pred Value            Precision               Recall                   F1 
           0.9361702            0.8319328            0.8148148            0.9428571            0.8148148            0.9361702            0.8712871 
          Prevalence       Detection Rate Detection Prevalence    Balanced Accuracy 
           0.4413146            0.4131455            0.5070423            0.8840515 

$mode
[1] "sens_spec"

$dots
list()

attr(,"class")
[1] "confusionMatrix"

所以感性是：

（来自插入符号confusionMatrix doc）

如果你采取我的混淆矩阵：

$table
        td
testPred  -  +
       - 99  6
       + 20 88

您可以看到这不会加起来：Sensetivity = 99/(99+20) = 99/119 = 0.831928。在我的confusionMatrix结果中，该值是针对特异性的。但特异性为Specificity = D/(B+D) = 88/(88+6) = 88/94 = 0.9361702，灵敏度的值。

我已经尝试了这个confusionMatrix(td,testPred, positive="+")，但结果更加逼真。我做错了什么？

更新：我也意识到我的混淆矩阵与插入符号的不同之处在于：

   Mine:               Caret:

            td             testPred
   testPred  -  +      td   -  +
          - 99  6        - 99 20
          + 20 88        +  6 88

如您所见，它认为我的假阳性和假阴性是倒退的。

Answer 1

UPDATE ：我发现发送数据要好得多，而不是将表作为参数。来自confusionMatrix文档：

参考
要用作真实结果的类的因子

我认为这意味着什么符号构成了积极的结果。就我而言，这可能是+。但是，参考＆＃39;指的是数据集的实际结果，也就是因变量。

所以我应该使用confusionMatrix(testPred, test$class)。如果由于某种原因您的数据出现故障，它会将其转换为正确的顺序（因此正面和负面结果/预测会在混淆矩阵中正确对齐。

但是，如果您担心结果是正确的因素，请安装plyr库，并使用revalue更改因子：

install.packages("plyr")
library(plyr)
newDF <- df
newDF$class <- revalue(newDF$class,c("+"=1,"-"=0))
# You'd have to rerun your model using newDF

我不确定为什么会这样，但我刚刚删除了正面参数：

confusionMatrix(table(testPred, test$class))

我的困惑矩阵：

        td
testPred  -  +
       - 99  6
       + 20 88

Caret的混乱矩阵：

        td
testPred  -  +
       - 99  6
       + 20 88

虽然现在它说$positive: "-"所以我不确定这是好还是坏。

如何将混淆矩阵发送到插入符号的混淆矩阵？

1 个答案: