Question

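I have a pair of 2D arrays of the same shape, (n, 3). I want to select rows from the first based on a comparison against the second. My idea was the following: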

data[labels == row]

where row is a vector of length 3. The inner boolean comparison gives an array of shape (n, 3). Indexing with it returns a flat 1D array.
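To make the shapes concrete, here is a minimal sketch on hypothetical toy arrays (these values are assumptions for illustration, not the asker's data). It only demonstrates the behavior described above: the comparison broadcasts to (n, 3), and indexing with that element-wise mask flattens the result.

```python
import numpy as np

# Hypothetical toy arrays, chosen only to illustrate the shapes.
labels = np.array([[1, 2, 3],
                   [1, 0, 0],
                   [1, 2, 3]])
data = np.arange(9).reshape(3, 3)
row = np.array([1, 2, 3])

mask = labels == row   # row is broadcast against every row of labels
flat = data[mask]      # an element-wise 2D mask returns a flat 1D result

print(mask.shape)  # (3, 3)
print(flat.shape)  # (7,)
```

The result has 7 elements rather than a multiple of 3 because the second row of labels matches row in only one position.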

My problem is that I either have to reshape the array manually, or use something like np.all on labels == row.

If data is a pandas DataFrame, this actually works as expected. What about plain ndarrays?

What is the correct way to do this?

Answer 1

Use (labels == row).all(axis=1) to select the rows where all values match:

import numpy as np
np.random.seed(2016)

labels = np.random.randint(10, size=(10, 3))
data = np.random.randint(10, size=(10, 3))
# data is now:
# array([[0, 8, 2],
#        [3, 2, 2],
#        [4, 0, 9],
#        [0, 4, 9],
#        [5, 5, 1],
#        [7, 8, 0],
#        [0, 9, 5],
#        [0, 6, 2],
#        [0, 0, 5],
#        [5, 0, 7]])

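# Copy the first row of labels into every third row, so that
# rows 0, 3, 6 and 9 of labels are complete matches for row.
row = labels[::3] = labels[0]
data[(labels == row).all(axis=1)]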

yields

array([[0, 8, 2],
       [0, 4, 9],
       [0, 9, 5],
       [5, 0, 7]])
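As a rough sketch, the same selection can be wrapped in a small helper. The name select_rows and the toy arrays below are hypothetical, introduced only for illustration; np.flatnonzero is used to show how the indices of the matching rows could be recovered from the same row mask.

```python
import numpy as np

def select_rows(data, labels, row):
    """Return the rows of data whose corresponding row in labels equals row.

    Hypothetical helper for illustration; assumes data and labels have the
    same number of rows and that row has one entry per column of labels.
    """
    row_mask = (np.asarray(labels) == np.asarray(row)).all(axis=1)  # shape (n,)
    return np.asarray(data)[row_mask]

# Toy arrays, not the ones generated in the answer.
labels = np.array([[1, 2, 3],
                   [1, 0, 0],
                   [1, 2, 3],
                   [4, 5, 6]])
data = np.arange(12).reshape(4, 3)

selected = select_rows(data, labels, [1, 2, 3])
print(selected)
# [[0 1 2]
#  [6 7 8]]

# Indices of the rows that matched completely.
matching = np.flatnonzero((labels == [1, 2, 3]).all(axis=1))
print(matching)  # [0 2]
```

Because the mask passed to the index is 1D with one entry per row, the result keeps its 2D shape and no reshaping is needed.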

Note that the boolean array labels == row has some True values on rows that are not complete matches:

In [138]: labels == row
Out[138]: 
array([[ True,  True,  True],
       [ True, False, False],    # <-- a lone True value
       [False,  True, False],    # <--
       [ True,  True,  True],
       [False, False, False],
       [False, False,  True],    # <--
       [ True,  True,  True],
       [False, False, False],
       [False, False, False],
       [ True,  True,  True]], dtype=bool)

So data[labels == row] returns some values that are unrelated to any complete row match:

In [141]: data[labels == row]
Out[141]: array([0, 8, 2, 3, 0, 0, 4, 9, 0, 0, 9, 5, 5, 0, 7])
                          ^  ^           ^
                          |  |           |
                          not related to a complete row match
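This is also why reshaping the flat result by hand is unreliable. The sketch below uses hypothetical toy arrays (assumptions for illustration only) in which the stray True values happen to add up to a multiple of 3, so the reshape succeeds but produces a row that never existed in data.

```python
import numpy as np

# Hypothetical toy arrays: only the first row of labels fully matches row,
# but each of the other rows matches in exactly one position.
labels = np.array([[1, 2, 3],
                   [1, 0, 0],
                   [0, 2, 0],
                   [0, 0, 3]])
data = np.arange(12).reshape(4, 3)
row = np.array([1, 2, 3])

flat = data[labels == row]       # 3 real values plus 3 stray ones
wrong = flat.reshape(-1, 3)      # reshapes without error, but is misleading
right = data[(labels == row).all(axis=1)]

print(flat)   # [ 0  1  2  3  7 11]
print(wrong)  # second row is assembled from stray values
print(right)  # [[0 1 2]]
```

When the number of stray values is not a multiple of 3, the reshape raises a ValueError instead, which is at least easier to notice.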

Boolean indexing of a 2D ndarray based on row comparison
