Question

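I just want to know whether this is a legitimate way to calculate classification accuracy:

Obtain the precision-recall thresholds
For each threshold, binarize the continuous y_scores
Calculate the accuracy from the contingency table (confusion matrix)
Return the average accuracy over the thresholds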

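import numpy as np
from sklearn.metrics import confusion_matrix, precision_recall_curve

# precision_recall_curve returns (precision, recall, thresholds), in that order
precision, recall, thresholds = precision_recall_curve(np.array(np_y_true), np.array(np_y_scores))
accuracy = 0.0
for threshold in thresholds:
    # Binarize the continuous scores at this threshold
    y_pred = (np.array(np_y_scores) > threshold).astype(int)
    contingency_table = confusion_matrix(np_y_true, y_pred)
    # Accuracy = (TN + TP) / total
    accuracy += float(np.trace(contingency_table)) / float(np.sum(contingency_table))

print("Classification accuracy is: {}".format(accuracy / len(thresholds)))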

Answer 1

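You're heading in the right direction. The confusion matrix is definitely the right starting point for calculating a classifier's accuracy. It seems to me that what you're after is the receiver operating characteristic.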

In statistics, a receiver operating characteristic (ROC), or ROC curve, is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied.
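To make that definition concrete, here is a minimal pure-Python sketch of how the points of a ROC curve arise as the threshold is swept. The function name `roc_points` and the toy data are illustrative assumptions, not something from the question. A score at or above the threshold is treated as positive, and both classes are assumed to be present.

```python
def roc_points(y_true, y_scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, highest threshold first."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    # A threshold above every score predicts nothing as positive: FPR = TPR = 0
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_scores), reverse=True):
        predicted = [score >= threshold for score in y_scores]
        tp = sum(1 for p, t in zip(predicted, y_true) if p and t == 1)
        fp = sum(1 for p, t in zip(predicted, y_true) if p and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy data (hypothetical): labels are 0/1, scores are continuous
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
for fpr, tpr in roc_points(y_true, y_scores):
    print("FPR={:.2f}  TPR={:.2f}".format(fpr, tpr))
```

In practice scikit-learn's `roc_curve` gives you these points directly, though by default it may drop some intermediate ones that do not change the shape of the curve.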

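The AUC (area under the ROC curve) is a measure of classifier performance that summarizes the curve in a single number. More information and explanations can be found here: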

https://en.wikipedia.org/wiki/Receiver_operating_characteristic

https://stats.stackexchange.com/questions/132777/what-does-auc-stand-for-and-what-is-it
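One way to read the AUC: it equals the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, with ties counted as half. Below is a small sketch under that interpretation. `auc_by_pairs` and the toy data are hypothetical names used only for illustration, and the quadratic loop is meant for clarity rather than speed.

```python
def auc_by_pairs(y_true, y_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(y_scores, y_true) if t == 1]
    neg = [s for s, t in zip(y_scores, y_true) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 of the 4 positive/negative pairs are ordered correctly
print(auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

For real data, scikit-learn's `roc_auc_score(y_true, y_scores)` computes the same quantity far more efficiently.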

Here is my implementation; you're welcome to improve it or comment on it:

Accounts.validateLoginAttempt(function(attempt) {

    if( ! attempt.user || ! attempt.user.profile.status.isActive){
        return false;
    } else {
        return true;
    }
});

Classification accuracy after recall and precision
