Question

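Is there a built-in way to get the accuracy score for each class separately? I know that in sklearn we can get the overall accuracy with metrics.accuracy_score. Is there a way to get a breakdown of accuracy scores for individual classes? Something similar to metrics.classification_report.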

from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

classification_report does not give accuracy scores:

print(classification_report(y_true, y_pred, target_names=target_names, digits=4))

Out[9]:         precision    recall  f1-score   support

class 0     0.5000    1.0000    0.6667         1
class 1     0.0000    0.0000    0.0000         1
class 2     1.0000    0.6667    0.8000         3

avg / total     0.7000    0.6000    0.6133         5

accuracy_score gives only the overall accuracy:

accuracy_score(y_true, y_pred)
Out[10]: 0.59999999999999998

Answer 1

You can get the per-class accuracy from sklearn's confusion matrix.

from sklearn.metrics import confusion_matrix
import numpy as np

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

#Get the confusion matrix
cm = confusion_matrix(y_true, y_pred)
#array([[1, 0, 0],
#   [1, 0, 0],
#   [0, 1, 2]])

#Now normalize the diagonal entries (divide each row by its total)
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
#array([[1.        , 0.        , 0.        ],
#      [1.        , 0.        , 0.        ],
#      [0.        , 0.33333333, 0.66666667]])

#The diagonal entries are the accuracies of each class
cm.diagonal()
#array([1.        , 0.        , 0.66666667])
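
If you would rather not normalize the matrix by hand, note that the row-normalized diagonal is the per-class recall, so the same figures should also come out of recall_score with average=None. This is an alternative sketch, not part of the original answer, and it reuses the data from the question:

```python
from sklearn.metrics import recall_score

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# average=None returns one recall value per class instead of a single average
per_class = recall_score(y_true, y_pred, average=None)
print(per_class)
```

The values match the recall column that classification_report prints in the question.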

Reference

plot Confusion matrix sklearn

Answer 2

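You can code it yourself: accuracy is nothing more than the ratio between the correctly classified samples (true positives and true negatives) and the total number of samples you have.

Then, for a given class, instead of considering all the samples, you only take into account the samples of that class.

You can then try this. Let's first define a handy function.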

def indices(l, val):
    # Collect every position at which val occurs in the list l
    retval = []
    last = 0
    while val in l[last:]:
        i = l[last:].index(val)
        retval.append(last + i)
        last += i + 1
    return retval

The function above returns the indices at which a given value val occurs in the list l.

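import numpy as np

def class_accuracy(y_pred, y_true, cls):
    # `class` is a reserved word in Python, so the parameter is named cls
    index = indices(list(y_true), cls)  # positions whose true label is cls
    y_pred, y_true = np.asarray(y_pred)[index], np.asarray(y_true)[index]
    # Count the samples of this class that were predicted correctly
    tp = np.sum(y_true == y_pred)
    return tp / float(len(y_pred))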

This last function returns the within-class accuracy you are looking for.

Answer 3

from sklearn.metrics import confusion_matrix
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
matrix = confusion_matrix(y_true, y_pred)
matrix.diagonal()/matrix.sum(axis=1)

Answer 4

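I am adding my answer because I have not found an answer to this exact question online, and because I think the other calculation methods suggested here before mine are incorrect.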

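Remember that accuracy is defined as:

accuracy = (true_positives + true_negatives) / all_samples

Or, to put it into words: it is the ratio of the number of correctly classified examples (positive or negative) to the total number of examples in the test set.

One important thing to note is that for both TN and FN, the "negative" is class-agnostic, meaning "not predicted as the specific class in question". For example, consider the following:

y_true = ['cat', 'dog', 'bird', 'bird']
y_pred = ['cat', 'dog', 'cat', 'dog']

Here, both the second 'cat' prediction and the second 'dog' prediction are false negatives, simply because they are not 'bird'.

To your question:

As far as I know, there is currently no package that provides a method for what you are looking for, but based on the definition of accuracy, we can use the confusion matrix method from sklearn to calculate it ourselves.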
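
A minimal sketch of that confusion-matrix calculation, treating each class in turn as "positive" and every other class as "negative". The names classes and per_class_accuracies are illustrative choices, not taken from a library, and the label order is fixed by hand through the labels argument:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = ['cat', 'dog', 'bird', 'bird']
y_pred = ['cat', 'dog', 'cat', 'dog']
classes = ['bird', 'cat', 'dog']  # assumed label order for rows and columns

# Rows are true labels, columns are predicted labels
cm = confusion_matrix(y_true, y_pred, labels=classes)
total = cm.sum()

per_class_accuracies = {}
for idx, cls in enumerate(classes):
    tp = cm[idx, idx]              # predicted cls and truly cls
    fp = cm[:, idx].sum() - tp     # predicted cls but truly another class
    fn = cm[idx, :].sum() - tp     # truly cls but predicted as another class
    tn = total - tp - fp - fn      # neither truly cls nor predicted cls
    per_class_accuracies[cls] = (tp + tn) / total

print(per_class_accuracies)
```

With this data, 'bird' gets 0.5 and 'cat' and 'dog' each get 0.75, because both misclassified birds count against 'bird' while only one of them counts against each of the other two classes.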

The original question was posted a while ago, but this may help anyone who, like me, arrives here through Google.

Answer 5

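I think accuracy is a generic term with different dimensions, such as precision, recall, f1-score (or even specificity and sensitivity), each of which measures accuracy from a different perspective. Hence the function classification_report outputs a range of accuracy measures for each class. For example, recall gives the proportion of correctly retrieved instances (the true positives) out of all the instances actually available in a particular class (true positives plus false negatives).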

Answer 6

The question is misleading. The accuracy score for each class equals the overall accuracy score. Consider the confusion matrix:

from sklearn.metrics import confusion_matrix
import numpy as np

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

#Get the confusion matrix
cm = confusion_matrix(y_true, y_pred)
print(cm)

This gives you:

 [[1 0 0]
  [1 0 0]
  [0 1 2]]

Accuracy is calculated as the proportion of correctly classified samples out of all samples:

accuracy = (TP + TN) / (P + N)

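In terms of the confusion matrix, the numerator (TP + TN) is the sum of the diagonal, and the denominator is the sum of all cells. Both are the same for every class.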
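
The arithmetic described here can be checked directly. A small sketch, with the matrix from this answer typed in by hand rather than recomputed:

```python
import numpy as np

# Confusion matrix printed in this answer (rows: true, columns: predicted)
cm = np.array([[1, 0, 0],
               [1, 0, 0],
               [0, 1, 2]])

correct = np.trace(cm)   # sum of the diagonal: correctly classified samples
total = cm.sum()         # sum of all cells: every sample
overall_accuracy = correct / total
print(overall_accuracy)
```

Three of the five samples lie on the diagonal, so this agrees with the 0.6 that accuracy_score returns in the question.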

Answer 7

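Your question makes no sense. Accuracy is a global measure, and there is no such thing as class-wise accuracy. The suggestions to normalize by the true cases (rows) yield what is called the true positive rate, sensitivity, or recall, depending on the context. Likewise, if you normalize by the predictions (columns), the result is called precision or positive predictive value.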
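
To make the two normalizations concrete, here is a short sketch using the confusion matrix from the question, typed in by hand. It assumes every class has at least one true sample and at least one prediction, so neither division hits a zero:

```python
import numpy as np

# Rows are true classes, columns are predicted classes
cm = np.array([[1, 0, 0],
               [1, 0, 0],
               [0, 1, 2]])

# Normalizing the diagonal by row totals gives recall (true positive rate)
recall = cm.diagonal() / cm.sum(axis=1)

# Normalizing the diagonal by column totals gives precision
precision = cm.diagonal() / cm.sum(axis=0)

print(recall)
print(precision)
```

Both arrays line up with the recall and precision columns of the classification_report output shown in the question.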

Answer 8

Here is the solution, bro:

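import pandas as pd

def classwise_accuracy(y_test, predict_over):
    # Rows: true classes; columns: predicted classes
    a = pd.crosstab(y_test, predict_over)
    # Caveat: the row maximum is the correct count only if each class is mostly predicted as itself
    print(a.max(axis=1) / a.sum(axis=1))

classwise_accuracy(y_test, predict_over)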

Scikit-learn: getting the accuracy score for each class
