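I am using LIBSVM to classify 256 classes. My dataset has roughly 5,000 to 10,000 instances. For the SVM, I use a multiclass strategy to train my model. I now get low accuracy (15%~30%) but a high AUC (> 90%). I would have thought that a high AUC (0.9 or higher) cannot be obtained when the accuracy of the corresponding prediction model is low (13-30%). Is that right?

I followed the open-source Python library scikit-learn to compute the AUC for the multiclass problem (http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#sphx-glr-auto-examples-model-selection-plot-roc-py). This is the code used to compute the AUC: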
import numpy as np
from sklearn import metrics
from sklearn.preprocessing import label_binarize

# Compute the ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
# test_label_kernel: the true label of each test instance
# LensOfLabel: the total number of classes
y = label_binarize(test_label_kernel, classes=list(range(LensOfLabel)))
# sort_pval: the class probabilities predicted by the SVM,
# with column i assumed to correspond to class i
for i in range(LensOfLabel):
    fpr[i], tpr[i], _ = metrics.roc_curve(y[:, i], sort_pval[:, i])
    roc_auc[i] = metrics.auc(fpr[i], tpr[i])
# First aggregate all false positive rates
n_classes = LensOfLabel
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))
# Then interpolate all ROC curves at these points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    # np.interp is used because a bare interp is not defined without an extra import
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])
# Finally average it and compute the AUC
mean_tpr /= n_classes
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = metrics.auc(fpr["macro"], tpr["macro"])
print("macroAUC: %.4f" % roc_auc["macro"])
# Compute the micro-average ROC curve and ROC area
fpr["micro"], tpr["micro"], _ = metrics.roc_curve(y.ravel(), sort_pval.ravel())
roc_auc["micro"] = metrics.auc(fpr["micro"], tpr["micro"])
print("microAUC: %.4f" % roc_auc["micro"])
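To sanity-check whether low accuracy and high AUC can coexist with this many classes, here is a minimal, purely synthetic sketch. It does not use my data or LIBSVM. The assumptions are all mine: every class score is Gaussian noise, the true class gets a fixed bonus (the hypothetical `shift` parameter), and the per-class AUC is computed from its pairwise definition (the fraction of positive/negative pairs ranked correctly), then averaged over classes. That plain mean of per-class AUCs is a simpler stand-in for the macro-averaged curve above, not the same computation.

```python
import random

def ovr_auc(scores, labels, cls):
    """One-vs-rest AUC for class `cls`: the fraction of (positive, negative)
    pairs in which the positive instance gets the higher score (ties count 0.5)."""
    pos = [row[cls] for row, y in zip(scores, labels) if y == cls]
    neg = [row[cls] for row, y in zip(scores, labels) if y != cls]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def simulate(n_classes=256, n_samples=1024, shift=2.0, seed=0):
    """Synthetic scores (an assumption, not real SVM output): noise for every
    class, plus a bonus of `shift` on the true class."""
    rng = random.Random(seed)
    labels = [i % n_classes for i in range(n_samples)]  # 4 instances per class
    scores = []
    for y in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]
        row[y] += shift
        scores.append(row)

    # Top-1 accuracy: the true class must beat all 255 other classes.
    correct = sum(
        max(range(n_classes), key=row.__getitem__) == y
        for row, y in zip(scores, labels)
    )
    accuracy = correct / n_samples

    # Mean of the per-class one-vs-rest AUCs.
    macro_auc = sum(ovr_auc(scores, labels, c) for c in range(n_classes)) / n_classes
    return accuracy, macro_auc

accuracy, macro_auc = simulate()
print("accuracy: %.3f, macro AUC: %.3f" % (accuracy, macro_auc))
```

In this toy setup the two metrics answer different questions: accuracy needs the true class to rank first among all 256 scores, while each one-vs-rest AUC only needs a class's own instances to score higher on that class than the other instances do. Whether my real model behaves like this is a separate matter.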
The ROC curve I get is: