Question

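My question is related to this one, on generating a confusion matrix in R with the table() function. I am looking for a solution that does not use a package (e.g. caret).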

Say these are the predictions and labels in a binary classification problem:

predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86, 0.15,  0.52, 0.74, 0.24)
labels      <- c(1,    1,    1,    0,    0,     1,    1,    1,    0,     1,    0,    0,    1,    0)

For these values, the solution below works well for creating a 2 * 2 confusion matrix at, say, threshold = 0.5:

# Confusion matrix for threshold = 0.5
conf_matrix <- as.matrix(table(predictions>0.5,labels))
  conf_matrix
     labels
       0 1
 FALSE 4 3
 TRUE  2 5

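However, if I choose any value smaller than min(predictions) or larger than max(predictions), I do not get a 2 * 2 matrix, because the thresholded data will contain no FALSE or no TRUE occurrence, for example: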

conf_matrix <- as.matrix(table(predictions>0.05,labels))
  conf_matrix
     labels
       0 1
  TRUE 6 8

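I need a way to consistently generate a 2 * 2 confusion matrix for every possible threshold (decision boundary) between 0 and 1, because I use it as an input to an optimization. Is there a way to adjust the table function so that it always returns a 2 * 2 matrix?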

Answer 1

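You can achieve this by turning the thresholded predictions into a factor variable with both levels declared explicitly: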

(conf_matrix <- as.matrix(table(factor(predictions>0.05, levels=c(FALSE, TRUE)), labels)))
#        labels
#         0 1
#   FALSE 0 0
#   TRUE  6 8
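
Since the question asks for matrices across many thresholds, the factor approach above can be wrapped in a small function. The following is only a sketch: the helper name confusion_at and the threshold grid are assumptions made for illustration, not part of the original answer. It also declares the label levels explicitly, on the assumption that the labels are coded 0 and 1, so the columns stay fixed even if one class happens to be absent.

```r
# Hypothetical helper (name chosen for illustration): returns a 2 x 2 table
# for any threshold by fixing the factor levels on both dimensions.
confusion_at <- function(predictions, labels, threshold) {
  predicted <- factor(predictions > threshold, levels = c(FALSE, TRUE))
  actual    <- factor(labels, levels = c(0, 1))  # assumes labels are coded 0/1
  table(predicted = predicted, actual = actual)
}

predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97,
                 0.89, 0.78, 0.86, 0.15, 0.52, 0.74, 0.24)
labels      <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0)

# Assumed grid of thresholds; any numeric vector would work here.
thresholds <- seq(0, 1, by = 0.05)
conf_matrices <- lapply(thresholds,
                        function(t) confusion_at(predictions, labels, t))
names(conf_matrices) <- thresholds

# Check that every table has the same 2 x 2 shape.
stopifnot(all(vapply(conf_matrices,
                     function(m) identical(dim(m), c(2L, 2L)),
                     logical(1))))
```

Because every element of conf_matrices has the same dimensions and the same row and column order, an objective function in an optimization can index the cells by name, for example m["TRUE", "1"] for the true positives.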
