Conditional filtering in a numpy matrix

Time: 2016-02-03 23:29:27

Tags: python numpy

I have a matrix like this:

[[1,2,3,4,5,6]
 ['a','b','c','d','e']
 [1,2,3,4,5,6]
 [1,2,3,4,5,6]
 [1,2,3,4,5,6]
 [1,1,1,0,0,0]
 [1,2,3,4,5,6]]

I want to run this query: pos_data = data[data[:, 5] == 1]

But I get this error:

IndexError: too many indices for array

How can I do this?

1 answer:

Answer 0: (score: 1)

Is there another error somewhere in your workflow? It seems to work in my test:

import numpy as np

data = np.random.randint(1, 23, (22136, 27))
data.shape
# (22136,27)
res = data[data[..., 5] == 1]
res.shape
# (1001, 27)
res
#array([[21, 10, 18, ..., 10, 12, 20],
#       [ 7, 20, 12, ..., 10, 13,  7],
#       [ 1, 12,  4, ...,  6, 19, 19],
#       ..., 
#       [ 8, 10, 18, ...,  4, 15,  8],
#       [ 1, 13,  4, ..., 22, 13, 21],
#       [11,  3, 18, ..., 18, 10,  5]])

Or, using your other example:

mat = np.array([[1,2,3,4,5,6],
               [1,2,3,4,5,6],
               [1,2,3,4,5,6],
               [1,1,1,0,0,0],
               [1,2,3,4,5,6],
               [1,2,3,4,5,6]])

mat[mat[:, 2] == 1]
# array([[1, 1, 1, 0, 0, 0]])

Or maybe this is not what you want?
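To spell out what that one-liner does, here is a minimal sketch that splits it into two steps. The names mask and selected are only illustrative and do not come from your code.

```python
import numpy as np

mat = np.array([[1, 2, 3, 4, 5, 6],
                [1, 2, 3, 4, 5, 6],
                [1, 2, 3, 4, 5, 6],
                [1, 1, 1, 0, 0, 0],
                [1, 2, 3, 4, 5, 6],
                [1, 2, 3, 4, 5, 6]])

# Step 1: compare one column against the value.
# The result is a 1-D boolean array with one entry per row.
mask = mat[:, 2] == 1

# Step 2: index the rows with the boolean array.
# Only the rows where mask is True are kept.
selected = mat[mask]

print(mask)
print(selected)
```

Both steps need mat to be a real 2-D array, so it is worth checking mat.ndim and mat.shape before filtering.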

I would also guess that a different notation avoids this error (the numpy documentation about indexing/slicing has some details):

In [20]: mat = np.array([[1,2,3,4,5,6], ['a','b','c','d','e'], [1,2,3,4,5,6]])

In [22]: mat[2,:]
Traceback (most recent call last):

  File "<ipython-input-322-142bfc45a932>", line 1, in <module>
    mat[2,:]

IndexError: too many indices for array

In [23]: mat[2,...]
Out[23]: array([1, 2, 3, 4, 5, 6], dtype=object)
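My guess at the underlying cause, sketched below under assumptions: the row ['a','b','c','d','e'] has five items while the others have six, so numpy cannot build a 2-D array and ends up with a 1-D array of Python lists, which is why a two-index expression fails. The names rows, ragged, clean and pos are hypothetical, and dropping the short row is just one possible cleanup, not necessarily what you want. The object array is built by hand here because recent NumPy releases reject ragged input to np.array unless dtype=object is given.

```python
import numpy as np

# Hypothetical reconstruction of the data in the question:
# the second row is shorter than the others.
rows = [[1, 2, 3, 4, 5, 6],
        ['a', 'b', 'c', 'd', 'e'],
        [1, 2, 3, 4, 5, 6]]

# Build a 1-D object array explicitly, one Python list per element.
ragged = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    ragged[i] = row

print(ragged.ndim, ragged.shape)

# Two indices on a 1-D array raise the IndexError from the question.
try:
    ragged[:, 5]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# ragged[2, ...] does not raise, but it returns a 0-d object array
# that wraps the list; it is not a numeric row.
wrapped = ragged[2, ...]

# One possible cleanup (an assumption): keep only rows of the
# expected length, so np.array can build a regular 2-D array.
clean = np.array([row for row in rows if len(row) == 6])
pos = clean[clean[:, 0] == 1]
print(clean.shape, pos.shape)
```

Once every row has the same length, the original data[data[:, 5] == 1] form should work unchanged.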