Question

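My MaskedArray a has shape (L, M, N), and I want to transfer its unmasked elements into a plain array b of the same shape, so that along the last dimension the first elements receive the unmasked values and the remaining elements are zero. For example, in 2D: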

a = [[--,  1,  2, --,  7, --,  5],
     [3 , --, --,  2, --, --, --]]

# Transfer to:

b = [[1, 2, 7, 5, 0, 0, 0],
     [3, 2, 0, 0, 0, 0, 0]]

The simplest way is a for loop, e.g.,

for idx in np.ndindex(np.shape(a)[:-1]):
   num = a[idx].count()
   b[idx][:num] = a[idx].compressed()
   # or perhaps,
   # b[idx][:num] = a[idx][~a[idx].mask]

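But this will be very slow for large arrays (in practice, I have many different arrays that share the same mask, and I want to convert all of them the same way). Is there a fancy-slicing way to do this?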
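For reference, the loop above can be made self-contained as follows. This is only a minimal sketch of the baseline: the example data is the 2D case from the question, and using 0 as a stand-in for the masked entries is an assumption made purely to build the test array.

```python
import numpy as np

# Example data from the question; zeros stand in for the masked entries.
data = np.array([[0, 1, 2, 0, 7, 0, 5],
                 [3, 0, 0, 2, 0, 0, 0]])
a = np.ma.masked_equal(data, 0)

# Plain output array of the same shape, pre-filled with zeros.
b = np.zeros(a.shape, dtype=a.dtype)

# Visit every index over all but the last axis.
for idx in np.ndindex(a.shape[:-1]):
    row = a[idx].compressed()   # unmasked values of this row, in order
    b[idx][:row.size] = row     # left-justify them; the rest stays 0

print(b)
```

The loop runs once per row in Python, which is the cost the question wants to avoid.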

Edit: here is one way to construct the appropriate index tuple for assigning the values, but it looks ugly. Perhaps there is something better?

b = np.zeros(a.shape)
# Construct a list with a list for each dimension.
left = [[] for ii in range(a.ndim)]
# In each sub-list, construct the indices to `b` to store each value from `a`
for idx in np.ndindex(a.shape[:-1]):
    num = a[idx].count()
    # here `ii` is the dimension number, and jj the index in that dimension
    for ii, jj in enumerate(idx):
        left[ii] = left[ii] + num*[jj]
    # The last dimension is just consecutive numbers for as many values
    left[-1] = left[-1] + list(range(num))

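# Index `b` with a tuple (one index list per dimension) and copy in the unmasked values of `a`
b[tuple(left)] = a.compressed()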

Answer 1

Adapting @Divakar's answer from the linked "pad with 0s" question,

Convert Python sequence to NumPy array, filling missing values

In [464]: a=np.array([[0,1,2,0,7,0,5],[3,0,0,2,0,0,0]])
In [465]: Ma = np.ma.masked_equal(a, 0)
In [466]: Ma
Out[466]: 
masked_array(data =
 [[-- 1 2 -- 7 -- 5]
 [3 -- -- 2 -- -- --]],
             mask =
 [[ True False False  True False  True False]
 [False  True  True False  True  True  True]],
       fill_value = 0)

Getting the number of zeros we need to pad is easy here: just sum the True values of the mask.

In [467]: cnt=Ma.mask.sum(axis=1)  # also np.ma.count_masked(Ma,1)
In [468]: cnt
Out[468]: array([3, 5])
In [469]: 
In [469]: mask=(7-cnt[:,None])>np.arange(7) # key non intuitive step
In [470]: mask
Out[470]: 
array([[ True,  True,  True,  True, False, False, False],
       [ True,  True, False, False, False, False, False]], dtype=bool)

mask is constructed so that the first 7 - cnt elements of each row (one per unmasked value) are True and the rest are False.
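The comparison in In [469] relies on broadcasting, which the following sketch spells out. The names width and keep are introduced here for illustration only, and the counts are copied from Out[468].

```python
import numpy as np

width = 7
cnt = np.array([3, 5])   # masked entries per row, as in Out[468]
keep = width - cnt       # unmasked entries per row

# keep[:, None] has shape (2, 1); np.arange(width) has shape (7,).
# Broadcasting compares every row's count with every column index,
# so position j in row i is True exactly when j < keep[i].
mask = keep[:, None] > np.arange(width)
print(mask.astype(int))
```

Each row therefore has exactly as many True slots as it has unmasked values, all packed to the left.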

Now just use this mask to copy the compressed values into a blank array:

In [471]: M=np.zeros((2,7),int)
In [472]: M[mask]=Ma.compressed()
In [473]: M
Out[473]: 
array([[1, 2, 7, 5, 0, 0, 0],
       [3, 2, 0, 0, 0, 0, 0]])

I had to fiddle with cnt and np.arange(7) to get the desired combination of True/False values (left-justified Trues).

Counting the unmasked values per row:

In [486]: np.ma.count(Ma,1)
Out[486]: array([4, 2])
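Since np.ma.count already gives the number of unmasked values per row, the subtraction from the row length can be skipped. The sketch below repeats the 2D example that way; the variable name keep is an illustrative choice, not part of the original session.

```python
import numpy as np

a = np.array([[0, 1, 2, 0, 7, 0, 5],
              [3, 0, 0, 2, 0, 0, 0]])
Ma = np.ma.masked_equal(a, 0)

keep = np.ma.count(Ma, axis=1)                 # unmasked values per row
mask = keep[:, None] > np.arange(Ma.shape[1])  # left-justified True slots

M = np.zeros(Ma.shape, dtype=Ma.dtype)
M[mask] = Ma.compressed()                      # fill the slots in row order
print(M)
```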

Generalizing this to N dimensions:

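def compress_masked_array(vals, axis=-1, fill=0.0):
    # Put the target axis last so the order of compressed() matches the mask below
    vals = np.ma.asarray(vals).swapaxes(axis, -1)
    # Number of masked entries along the (now last) axis
    cnt = np.ma.getmaskarray(vals).sum(axis=-1)
    num = vals.shape[-1]
    # True for the first (num - cnt) slots of each row, False for the rest
    mask = (num - cnt[..., np.newaxis]) > np.arange(num)
    n = np.full(vals.shape, fill)
    n[mask] = vals.compressed()
    # Restore the original axis order
    return n.swapaxes(axis, -1)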
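The question mentions many arrays that share one mask. In that case the destination slots only need to be computed once. The following is a rough sketch under that assumption: the shape, the random mask, and the helper name left_justify are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (2, 3, 5)                      # hypothetical (L, M, N)
shared_mask = rng.random(shape) < 0.4  # True marks a masked entry

# Compute the destination slots once from the shared mask.
keep = (~shared_mask).sum(axis=-1)     # unmasked values per row
dest = keep[..., np.newaxis] > np.arange(shape[-1])

def left_justify(values, fill=0.0):
    """Hypothetical helper: pack unmasked values to the front of the last axis."""
    out = np.full(values.shape, fill)
    # Both boolean selections run in row-major order, so the values line up.
    out[dest] = values[~shared_mask]
    return out

arrays = [rng.random(shape) for _ in range(3)]
results = [left_justify(arr) for arr in arrays]
```

Each call is then two boolean-indexing operations, with no per-array counting.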
