Question

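I have a 2n x 2m numpy array. I would like to form an n x m array by randomly selecting one element from each of the non-overlapping 2 x 2 subarrays that partition my initial array. What is the best way to do this? Is there a way to avoid two for loops (one along each dimension)?

For example, if my array is

then there are four 2 x 2 subarrays that partition it:

and I would like to randomly pick one element from each of them to form a new array, such as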

5 3  ,  6 8  ,  2 3
9 2     9 1     0 0  .

Thank you for your time.

Answer 1

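This can be done by sampling. Instead of sampling each 2x2 square separately, we slice the whole ndarray into 4 separate ndarrays, so that the same index in each of these sub-arrays points into the same 2x2 square. Then we sample randomly among these 4 ndarrays: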

import numpy as np

# create test dataset
test = np.arange(36).reshape(6,6)

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

# Create subsamples from ndarray
samples = np.array([test[::2, ::2], test[1::2, 1::2], test[::2, 1::2], test[1::2, ::2]])
>>> samples
array([[[ 0,  2,  4],
        [12, 14, 16],
        [24, 26, 28]],

       [[ 7,  9, 11],
        [19, 21, 23],
        [31, 33, 35]],

       [[ 1,  3,  5],
        [13, 15, 17],
        [25, 27, 29]],

       [[ 6,  8, 10],
        [18, 20, 22],
        [30, 32, 34]]])

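Now the same index in each of these 4 subsamples points into the same 2x2 square of the original ndarray. We only need to choose randomly among the subsamples at each index: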

# Random choice sampling between these 4 subsamples.
select = np.random.randint(4,size=(3,3))
>>> select
array([[2, 2, 1],
       [3, 1, 1],
       [3, 0, 0]])

result = select.choose(samples)
>>> result

array([[ 1,  3, 11],
       [18, 21, 23],
       [30, 26, 28]])
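
As a minimal sketch, the steps above could be wrapped in one function. The name `random_pool_2x2` and the optional `rng` argument are assumptions made for this illustration and are not part of the original answer; it uses `np.random.default_rng` so that a run can be made reproducible.

```python
import numpy as np

def random_pool_2x2(arr, rng=None):
    """Pick one random element from each non-overlapping 2x2 block of arr.

    Hypothetical helper: the name and the rng parameter are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = arr.shape
    if h % 2 or w % 2:
        raise ValueError("both dimensions must be even")
    # Four strided views, one per position inside a 2x2 block.
    samples = np.stack([arr[::2, ::2], arr[::2, 1::2],
                        arr[1::2, ::2], arr[1::2, 1::2]])
    # For every block, draw which of the four positions to keep.
    select = rng.integers(4, size=(h // 2, w // 2))
    return np.choose(select, samples)

test = np.arange(36).reshape(6, 6)
result = random_pool_2x2(test, rng=np.random.default_rng(0))
print(result.shape)
```

Each entry `result[i, j]` comes from the block `test[2*i:2*i+2, 2*j:2*j+2]`, and no Python-level loop is involved.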

Answer 2

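I took the blockshaped function from another answer. This answer assumes that the size of the original array is suitable for the operation, i.e. that both dimensions are divisible by the block size.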

import numpy as np

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))


arr = np.array([[1,2,3,4],[5,6,7,8],[9,0,1,2],[8,5,7,0]])

#  arr is a 2D array of shape (m, n)
m = arr.shape[0]
n = arr.shape[1]

#  define blocksize
block_size = 2

#  divide into sub 2x2 arrays
#  blocks is a (Nx2x2) array
blocks = blockshaped(arr, block_size, block_size)

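#  select one random element from each block to form the new array
num_blocks = blocks.shape[0]  # number of blocks, not block_size**2
rows = np.random.randint(low=0, high=block_size, size=num_blocks)
cols = np.random.randint(low=0, high=block_size, size=num_blocks)
#  blocks are ordered row by row, so reshape back to the block grid
new_arr = blocks[np.arange(num_blocks), rows, cols].reshape(m // block_size, n // block_size)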

print("original array:")
print(arr)

print("random pooled array:")
print(new_arr)
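
The same block idea can be sketched for an arbitrary block size by flattening each block and picking one flat index per block with `np.take_along_axis`. The function name `random_pool` and its parameters are hypothetical and chosen only for this example; like the answer above, it assumes both dimensions are divisible by the block size.

```python
import numpy as np

def random_pool(arr, block, rng=None):
    """Return one randomly chosen element per non-overlapping block x block tile.

    Hypothetical helper: name and signature are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = arr.shape
    if h % block or w % block:
        raise ValueError("array dimensions must be divisible by block")
    bh, bw = h // block, w // block
    # Group the array into tiles, then flatten each tile:
    # (bh, block, bw, block) -> (bh, bw, block, block) -> (bh, bw, block*block)
    tiles = (arr.reshape(bh, block, bw, block)
                .swapaxes(1, 2)
                .reshape(bh, bw, block * block))
    # One flat index per tile, kept 3-D so it lines up with tiles.
    idx = rng.integers(block * block, size=(bh, bw, 1))
    return np.take_along_axis(tiles, idx, axis=2)[..., 0]

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 1, 2], [8, 5, 7, 0]])
pooled = random_pool(arr, 2, rng=np.random.default_rng(0))
print(pooled)
```

The result already has shape `(m // block, n // block)`, so no final reshape is needed.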

How can I downsample a 2D array by randomly selecting an element from each 2x2 subarray?
