Question

I'm fairly new to numpy and scientific computing, and I've been struggling with a problem for several days, so I decided to post it here.

I'm trying to count the occurrences of a specific condition in a numpy array.

In [233]: import numpy as np

In [234]: a= np.random.random([5,5])

In [235]: a >.7
Out[235]: array([[False,  True,  True, False, False],
   [ True, False, False, False,  True],
   [ True, False,  True,  True, False],
   [False, False, False, False, False],
   [False, False,  True, False, False]], dtype=bool)

I want to count the occurrences of True in each row and keep the rows where this count reaches a certain threshold:

For example:

results = []
threshold = 2

for i, row in enumerate(a > .7):
    if len([value for value in row if value]) > threshold:
        results.append(i)  # keep the index of each row with more than 'threshold' True values

This is the unoptimized version of the code, but I'd love to achieve the same thing with numpy (I have a very large matrix to process).

I've been trying all sorts of things with np.where, but I only get flattened results. I need the row numbers.

Thanks in advance!

Answer 1

To make the results reproducible, use a seed:

>>> np.random.seed(100)

Then the sample matrix:

>>> a = np.random.random([5,5])
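
As a side note, newer NumPy code often seeds a Generator object instead of the global state. The following is a minimal sketch of that alternative, not what this answer uses; it will generally not reproduce the same numbers as the np.random.seed call above.

```python
import numpy as np

rng = np.random.default_rng(100)   # seeded Generator; 100 is just the seed used above
a = rng.random((5, 5))             # 5x5 floats in [0, 1)

# Re-creating the generator with the same seed gives the same matrix.
b = np.random.default_rng(100).random((5, 5))
print(np.array_equal(a, b))  # True
```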

Count the occurrences by summing along an axis:

>>> (a >.7).sum(axis=1)
array([1, 0, 3, 1, 2])
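
The sum works as a counter because each True is treated as 1 and each False as 0. Here is a small sketch on a hand-written matrix (the values are made up for illustration, chosen to give the same True/False pattern as in the question), also showing np.count_nonzero as an equivalent spelling:

```python
import numpy as np

# Hypothetical fixed sample so the result does not depend on a random seed.
a = np.array([
    [0.1, 0.8, 0.9, 0.2, 0.3],
    [0.9, 0.1, 0.2, 0.3, 0.8],
    [0.8, 0.1, 0.9, 0.75, 0.2],
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.1, 0.2, 0.9, 0.3, 0.4],
])

mask = a > 0.7                               # boolean matrix, same shape as a
counts_sum = mask.sum(axis=1)                # True counts as 1: one total per row
counts_nz = np.count_nonzero(mask, axis=1)   # same counts, spelled more explicitly

print(counts_sum.tolist())  # [2, 2, 3, 0, 1]
```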

You can get the row numbers with np.where:

>>> np.where((a > .7).sum(axis=1) >= 2)
(array([2, 4]),)
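
Note that np.where with a single argument returns a tuple with one index array per dimension, so the row numbers are its first element. The sketch below uses the boolean matrix printed in the question and compares the strict test from the question's loop (> threshold) with the inclusive test used here (>= 2); np.flatnonzero is an alternative that returns the index array directly.

```python
import numpy as np

# The boolean matrix shown in the question's Out[235].
mask = np.array([
    [False, True, True, False, False],
    [True, False, False, False, True],
    [True, False, True, True, False],
    [False, False, False, False, False],
    [False, False, True, False, False],
])

threshold = 2
counts = mask.sum(axis=1)                           # True values per row

rows_strict = np.flatnonzero(counts > threshold)    # same test as the question's loop
rows_inclusive = np.where(counts >= threshold)[0]   # same test as in this answer

print(rows_strict.tolist())     # [2]
print(rows_inclusive.tolist())  # [0, 1, 2]
```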

To filter the result, just use boolean indexing:

>>> a[(a > .7).sum(axis=1) >= 2]
array([[ 0.89041156,  0.98092086,  0.05994199,  0.89054594,  0.5769015 ],
       [ 0.54468488,  0.76911517,  0.25069523,  0.28589569,  0.85239509]])
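
If both the row numbers and the filtered rows are needed, the two steps can be wrapped together. This is only a sketch: the function name rows_meeting_threshold and its parameters are hypothetical, and the data is made up for illustration.

```python
import numpy as np

def rows_meeting_threshold(a, cutoff, min_count):
    """Hypothetical helper: indices and contents of rows with at least
    min_count entries greater than cutoff."""
    keep = (a > cutoff).sum(axis=1) >= min_count  # one boolean per row
    return np.flatnonzero(keep), a[keep]

# Made-up data for illustration only.
a = np.array([
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.4],
    [0.1, 0.95, 0.75],
])
idx, kept = rows_meeting_threshold(a, cutoff=0.7, min_count=2)
print(idx.tolist())  # [0, 2]
print(kept.shape)    # (2, 3)
```

Computing the per-row mask once and reusing it for both results avoids scanning a large matrix twice.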

Answer 2

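You can sum the boolean mask along an axis with sum (axis=1 gives one count per row).
Then you can use np.where on the resulting vector.

results = np.where((a > .7).sum(axis=1) > threshold)[0]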
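
When summing a boolean mask, the axis argument decides whether you get one count per column or one per row, which is easy to mix up. A tiny sketch on a made-up 2x3 mask:

```python
import numpy as np

mask = np.array([[True, True, False],
                 [False, True, False]])

per_column = mask.sum(axis=0)  # collapses the rows: one count per column, shape (3,)
per_row = mask.sum(axis=1)     # collapses the columns: one count per row, shape (2,)

print(per_column.tolist())  # [1, 2, 0]
print(per_row.tolist())     # [2, 1]
```

For row numbers, as asked in the question, the per-row counts (axis=1) are the ones to compare against the threshold.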

Counting occurrences of a specific condition in each row of a matrix
