I wrote a simple function that uses average_precision_score from scikit-learn to compute average precision.
My code:
def compute_average_precision(predictions, gold):
    gold_predictions = np.zeros(predictions.size, dtype=np.int)
    for idx in range(gold):
        gold_predictions[idx] = 1
    return average_precision_score(predictions, gold_predictions)
Running this function produces the following error.
Traceback (most recent call last):
  File "test.py", line 91, in <module>
    total_avg_precision += compute_average_precision(np.asarray(probs), len(gold_candidates))
  File "test.py", line 29, in compute_average_precision
    return average_precision_score(predictions, gold_predictions)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/ranking.py", line 184, in average_precision_score
    average, sample_weight=sample_weight)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/base.py", line 81, in _average_binary_score
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: continuous format is not supported
If I print the two numpy arrays, predictions and gold_predictions, for one example, they look fine. [An example is given below.]
[ 0.40865014 0.26047812 0.07588802 0.26604077 0.10586583 0.17118802
0.26797949 0.34618672 0.33659923 0.22075308 0.42288553 0.24908153
0.26506338 0.28224747 0.32942101 0.19986877 0.39831917 0.23635269
0.34715138 0.39831917 0.23635269 0.35822859 0.12110706]
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
What am I doing wrong here? What does the error mean?
Answer 0 (score: 6)
Just look at the sklearn docs:
Parameters:
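y_true : array, shape = [n_samples] or [n_samples, n_classes]. True binary labels in binary label indicators.
y_score : array, shape = [n_samples] or [n_samples, n_classes]. Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).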
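So the first argument must be an array of binary labels, but you are passing an array of floats as the first argument. The message "continuous format is not supported" is sklearn reporting that it inferred the type of y_true as continuous rather than binary. I believe you just need to swap the order of the arguments you pass.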
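As a minimal sketch of that fix, here is the function from the question with the arguments swapped. It keeps the question's assumption that the first `gold` entries of `predictions` are the positive items. The `probs` values are just the first six scores from the printed example, and the `type_of_target` call is only there to illustrate why the float array was rejected as `y_true`. I also used the builtin `int` as the dtype, since the `np.int` alias has been removed in recent NumPy releases.

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.utils.multiclass import type_of_target


def compute_average_precision(predictions, gold):
    # Assumption carried over from the question: the first `gold` entries
    # of `predictions` correspond to the positive (relevant) items.
    gold_predictions = np.zeros(predictions.size, dtype=int)
    gold_predictions[:gold] = 1
    # y_true (binary labels) goes first, y_score (float scores) second.
    return average_precision_score(gold_predictions, predictions)


# First six scores from the example printed in the question.
probs = np.array([0.40865014, 0.26047812, 0.07588802,
                  0.26604077, 0.10586583, 0.17118802])

# Shows how sklearn classifies a float array when it is passed as y_true.
print(type_of_target(probs))

# With the arguments in the documented order the call no longer raises.
print(compute_average_precision(probs, 1))
```

If your positives are not always the first entries, you would build `gold_predictions` from the actual labels instead of from a count.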