Question

I have a 2D array (for this example; it can actually be ND) for which I would like to create a mask that covers the end of each row. For example:

np.random.seed(0xBEEF)
a = np.random.randint(10, size=(5, 6))
mask_indices = np.argmax(a, axis=1)

I would like to convert mask_indices into a boolean mask. Currently, I can't think of a better way than

mask = np.zeros(a.shape, dtype=bool)
for r, m in enumerate(mask_indices):
    mask[r, m:] = True

So for

a = np.array([[6, 5, 0, 2, 1, 2],
              [8, 1, 3, 7, 1, 9],
              [8, 7, 6, 7, 3, 6],
              [2, 7, 0, 3, 1, 7],
              [5, 4, 0, 7, 6, 0]])

and

mask_indices = np.array([0, 5, 0, 1, 3])

I would like to see

mask = np.array([[ True,  True,  True,  True,  True,  True],
                 [False, False, False, False, False,  True],
                 [ True,  True,  True,  True,  True,  True],
                 [False,  True,  True,  True,  True,  True],
                 [False, False, False,  True,  True,  True]])

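Is there a vectorized form of this operation?

In general, I would like to be able to do this across all dimensions other than the one that defines the index points.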

Answer 1

I. N-dim array masking along the last axis (rows)

To mask an n-dim array along its rows, we can do -

def mask_from_start_indices(a, mask_indices):
    r = np.arange(a.shape[-1])
    return mask_indices[...,None]<=r

Sample run -

In [177]: np.random.seed(0)
     ...: a = np.random.randint(10, size=(2, 2, 5))
     ...: mask_indices = np.argmax(a, axis=-1)

In [178]: a
Out[178]: 
array([[[5, 0, 3, 3, 7],
        [9, 3, 5, 2, 4]],

       [[7, 6, 8, 8, 1],
        [6, 7, 7, 8, 1]]])

In [179]: mask_indices
Out[179]: 
array([[4, 0],
       [2, 3]])

In [180]: mask_from_start_indices(a, mask_indices)
Out[180]: 
array([[[False, False, False, False,  True],
        [ True,  True,  True,  True,  True]],

       [[False, False,  True,  True,  True],
        [False, False, False,  True,  True]]])
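As a quick sanity check, the broadcasting approach can be compared with the explicit loop from the question on the question's own 2D data. This is only a sketch; the name `expected` is introduced here for illustration and is not part of the answer.

```python
import numpy as np

def mask_from_start_indices(a, mask_indices):
    # Positions 0..n-1 along the last axis
    r = np.arange(a.shape[-1])
    # The added trailing axis lets every start index broadcast against r
    return mask_indices[..., None] <= r

a = np.array([[6, 5, 0, 2, 1, 2],
              [8, 1, 3, 7, 1, 9],
              [8, 7, 6, 7, 3, 6],
              [2, 7, 0, 3, 1, 7],
              [5, 4, 0, 7, 6, 0]])
mask_indices = np.array([0, 5, 0, 1, 3])

# Reference result: the explicit loop from the question
expected = np.zeros(a.shape, dtype=bool)
for row, start in enumerate(mask_indices):
    expected[row, start:] = True

mask = mask_from_start_indices(a, mask_indices)
print(np.array_equal(mask, expected))
```

Both versions should agree, since `mask_indices[..., None] <= r` is True exactly at the positions `start, start+1, ...` of each row.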

II. N-dim array masking along a generic axis

To mask an n-dim array along a generic axis, it would be -

def mask_from_start_indices_genericaxis(a, mask_indices, axis):
    r = np.arange(a.shape[axis]).reshape((-1,)+(1,)*(a.ndim-axis-1))
    mask_indices_nd = mask_indices.reshape(np.insert(mask_indices.shape,axis,1))
    return mask_indices_nd<=r

Sample run -

Data array setup:

In [288]: np.random.seed(0)
     ...: a = np.random.randint(10, size=(2, 3, 5))

In [289]: a
Out[289]: 
array([[[5, 0, 3, 3, 7],
        [9, 3, 5, 2, 4],
        [7, 6, 8, 8, 1]],

       [[6, 7, 7, 8, 1],
        [5, 9, 8, 9, 4],
        [3, 0, 3, 5, 0]]])

Index setup and masking along axis=1 -

In [290]: mask_indices = np.argmax(a, axis=1)

In [291]: mask_indices
Out[291]: 
array([[1, 2, 2, 2, 0],
       [0, 1, 1, 1, 1]])

In [292]: mask_from_start_indices_genericaxis(a, mask_indices, axis=1)
Out[292]: 
array([[[False, False, False, False,  True],
        [ True, False, False, False,  True],
        [ True,  True,  True,  True,  True]],

       [[ True, False, False, False, False],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]]])

Index setup and masking along axis=2 -

In [293]: mask_indices = np.argmax(a, axis=2)

In [294]: mask_indices
Out[294]: 
array([[4, 0, 2],
       [3, 1, 3]])

In [295]: mask_from_start_indices_genericaxis(a, mask_indices, axis=2)
Out[295]: 
array([[[False, False, False, False,  True],
        [ True,  True,  True,  True,  True],
        [False, False,  True,  True,  True]],

       [[False, False, False,  True,  True],
        [False,  True,  True,  True,  True],
        [False, False, False,  True,  True]]])
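Note that the reshape arithmetic in the generic-axis function assumes a non-negative `axis`. If negative axes are needed, one possible variant (a sketch only; the function name is hypothetical and not from the answer) normalizes the axis first and uses `np.expand_dims` to re-insert the reduced dimension:

```python
import numpy as np

def mask_from_start_indices_anyaxis(a, mask_indices, axis):
    # Hypothetical variant: map negative axes to their non-negative equivalent
    axis = axis % a.ndim
    # Shape r as (n, 1, ..., 1) so that n lines up with `axis` when broadcast
    r = np.arange(a.shape[axis]).reshape((-1,) + (1,) * (a.ndim - axis - 1))
    # Put back the axis removed by argmax, as a length-1 dimension
    return np.expand_dims(mask_indices, axis) <= r

np.random.seed(0)
a = np.random.randint(10, size=(2, 3, 5))
idx = np.argmax(a, axis=1)

m_pos = mask_from_start_indices_anyaxis(a, idx, axis=1)
m_neg = mask_from_start_indices_anyaxis(a, idx, axis=-2)
print(m_pos.shape, np.array_equal(m_pos, m_neg))
```

For this (2, 3, 5) array, `axis=1` and `axis=-2` refer to the same dimension, so the two masks should be identical.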

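Other cases

A. Extending to given end/stop indices for masking

To extend the solutions to the case where we are given end/stop indices for masking, i.e. where we want to vectorize mask[r, :m] = True, we only need to change the final comparison step in the posted solutions to the following -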

return mask_indices_nd>r
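A minimal 2D sketch of the stop-index case, checked against the loop form `mask[r, :m] = True`. The values in `stop_indices` and the width `n_cols` are made up for illustration.

```python
import numpy as np

stop_indices = np.array([0, 5, 2])  # illustrative values, one per row
n_cols = 6                          # illustrative row length

# Vectorized form: True strictly before each stop index
mask = stop_indices[:, None] > np.arange(n_cols)

# Loop reference
expected = np.zeros((len(stop_indices), n_cols), dtype=bool)
for row, stop in enumerate(stop_indices):
    expected[row, :stop] = True

print(np.array_equal(mask, expected))
print(mask.sum(axis=1))  # count of masked elements per row
```

With an exclusive stop, the number of True values in each row equals that row's stop index.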

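B. Outputting an int array

There may be cases where we want an int array instead. For those, simply view the output as one. So, if out is the output of the posted solutions, we can use out.view('i1') or out.view('u1') to get int8 or uint8 dtype outputs respectively.

For other dtypes, we need to use .astype() for the dtype conversion.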
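The difference between the two routes can be illustrated with a small sketch: a view reinterprets the existing one-byte-per-element buffer, while `.astype()` allocates a new array. The array `out` below is a made-up stand-in for the output of the posted solutions.

```python
import numpy as np

# Stand-in for the boolean output of the posted solutions
out = np.array([[False, True, True],
                [True, True, True]])

as_int8 = out.view('i1')           # same buffer, reinterpreted as int8
as_int64 = out.astype(np.int64)    # new array with converted values

print(as_int8.dtype, np.shares_memory(out, as_int8))
print(as_int64.dtype, np.shares_memory(out, as_int64))
```

The view only works for one-byte integer dtypes because a NumPy boolean occupies exactly one byte per element.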

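C. Index-inclusive masking for stop indices

For index-inclusive masking, i.e. where the stop index itself should be included in the mask, we simply need to include equality in the comparison. So the last step would be -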

return mask_indices_nd>=r

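D. Index-exclusive masking for start indices

In this case we are given start indices that should not themselves be masked; masking begins at the next element and runs to the end. So, following the same reasoning as in the previous section, we would change the last step to -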

return mask_indices_nd<r
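The four comparison variants can be summarized side by side on a single row. This is an illustrative sketch with a made-up index `m = 2` and a row of length 6.

```python
import numpy as np

r = np.arange(6)  # positions along the masked axis
m = 2             # a single example index

start_inclusive = (m <= r).astype(int)
start_exclusive = (m < r).astype(int)
stop_exclusive = (m > r).astype(int)
stop_inclusive = (m >= r).astype(int)

print(start_inclusive)  # [0 0 1 1 1 1]
print(start_exclusive)  # [0 0 0 1 1 1]
print(stop_exclusive)   # [1 1 0 0 0 0]
print(stop_inclusive)   # [1 1 1 0 0 0]
```

Only the comparison operator changes between the variants; the broadcasting setup stays the same.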

Answer 2

>>> az = np.zeros(a.shape)
>>> az[np.arange(az.shape[0]), mask_indices] = 1
>>> az.cumsum(axis=1).astype(bool)  # use n-th dimension for nd case
array([[ True,  True,  True,  True,  True,  True],
       [False, False, False, False, False,  True],
       [ True,  True,  True,  True,  True,  True],
       [False,  True,  True,  True,  True,  True],
       [False, False, False,  True,  True,  True]])
