Question

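I have two columns, Prediction and GroundTruth. I want to get a running count of true positives as a series, using either numpy or pandas.

For example, my data is: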

Prediction GroundTruth
True       True
True       False
True       True
False      True
False      False
True       True

I want a list with the following output:

tp_list = [1,1,2,2,2,3]

Is there a one-liner for this in numpy or pandas?

Currently, this is my solution:

tp = 0
tp_list = []  # running count after each row
for p, g in zip(data.Prediction, data.GroundTruth):
  if p and g: # TP case
    tp = tp + 1
  tp_list.append(tp)

Answer 1

To get a running count (i.e. a cumulative sum) of true positives, that is, rows where both Prediction == True and GroundTruth == True, the solution is a modification of @RafaelC's answer:

(df['Prediction'] & df['GroundTruth']).cumsum()

Answer 2

If you want to know how many of the predicted True values are actually True, use

(df['Prediction'] & df['GroundTruth']).cumsum()

0    1
1    1
2    2
3    2
4    2
5    3
dtype: int64

(Thanks to @Peter Leimbigiler for chiming in)
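For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample data from the question is loaded into a DataFrame called df (the question itself calls it data), and the intermediate name is_tp is only illustrative.

```python
import pandas as pd

# Sample data from the question; the variable name `df` is an assumption.
df = pd.DataFrame({
    "Prediction":  [True, True, True, False, False, True],
    "GroundTruth": [True, False, True, True, False, True],
})

# A row is a true positive only when both columns are True.
is_tp = df["Prediction"] & df["GroundTruth"]

# Summing the boolean mask cumulatively gives the running count.
tp_list = is_tp.cumsum().tolist()
print(tp_list)  # [1, 1, 2, 2, 2, 3]
```

This matches the tp_list the question asks for, without the explicit loop.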

If you want to know how many predictions were correct, just compare the columns and use cumsum

(df['Prediction'] == df['GroundTruth']).cumsum()

Output

0    1
1    1
2    2
3    2
4    3
5    4
dtype: int64

You can always call .tolist() to get a list

(df['Prediction'] == df['GroundTruth']).cumsum().tolist()

[1, 1, 2, 2, 3, 4]
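The two expressions in this answer count different things, which is easy to miss. As a rough sketch (the column names running_tp and running_correct are made up for illustration, and df is assumed to hold the question's sample data), putting them side by side shows where they diverge:

```python
import pandas as pd

# Sample data from the question; `df` is an assumed name.
df = pd.DataFrame({
    "Prediction":  [True, True, True, False, False, True],
    "GroundTruth": [True, False, True, True, False, True],
})

summary = pd.DataFrame({
    # `&` counts only rows where both columns are True (true positives).
    "running_tp": (df["Prediction"] & df["GroundTruth"]).cumsum(),
    # `==` also counts rows where both are False (true negatives).
    "running_correct": (df["Prediction"] == df["GroundTruth"]).cumsum(),
})
print(summary)
```

The counts agree until row 4 (False, False), which is a correct prediction but not a true positive. The output requested in the question corresponds to the `&` version.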

Answer 3

Maybe you can use all

df.all(1).cumsum().tolist()
Out[156]: [1, 1, 2, 2, 2, 3]

numpy solution

np.cumsum(np.all(df.values,1))
Out[159]: array([1, 1, 2, 2, 2, 3], dtype=int32)
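Note that df.all(1) reduces across every column, so it gives the true-positive count only when df contains just the two boolean columns. If the data lives in plain arrays, or the frame has other columns, a sketch along these lines may be safer (the array names prediction and ground_truth are assumptions for illustration):

```python
import numpy as np

# Sample data from the question as plain boolean arrays (names assumed).
prediction = np.array([True, True, True, False, False, True])
ground_truth = np.array([True, False, True, True, False, True])

# Element-wise AND marks true positives; cumsum turns the mask into a running count.
tp_running = np.cumsum(prediction & ground_truth)
print(tp_running.tolist())  # [1, 1, 2, 2, 2, 3]
```

The integer dtype of the result depends on the platform, which is why the output above shows int32 while the pandas output shows int64.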

How do I calculate the occurrence of true positives using pandas or numpy?
