Extracting hypercube blocks from a numpy array with an unknown number of dimensions

Time: 2016-09-05 13:08:55

Tags: python numpy multidimensional-array

I have some Python code that is currently hard-wired to work on a two-dimensional array, as follows:

import numpy as np
data = np.random.rand(5, 5)
W = 3  # edge length of each square block

for y in range(0, data.shape[1] - W + 1):
    for x in range(0, data.shape[0] - W + 1):
        block = data[x:x+W, y:y+W]
        # Do something with this block

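As it stands, this is hard-coded for 2D arrays, and I would like to extend it to 3D and 4D arrays. I could of course write more functions for the other dimensionalities, but I am wondering whether there is a Python/numpy trick to generate these sub-blocks without duplicating the function for each number of dimensions.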

1 answer:

Answer 0: (score: 2)

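Here is my take on the problem. The idea behind the code below is to find the "start index" of every slice of the data. For the 4x4x4 sub-arrays of a 5x5x5 array, for example, the start indices are (0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1), and the slice length along each dimension is 4.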
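As a quick check of that enumeration, the start indices can be generated with `itertools.product`. This is only a minimal sketch; the names `shape`, `width`, and `starts` are illustrative choices for this example.

```python
from itertools import product

shape = (5, 5, 5)   # shape of the full array
width = 4           # edge length of each sub-array

# Along each axis the valid start positions are 0 .. (axis length - width)
starts = list(product(*(range(n - width + 1) for n in shape)))

print(len(starts))  # 8
print(starts[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```

Each axis contributes `n - width + 1` start positions, so the total number of sub-arrays is the product of those counts over all axes (here 2 * 2 * 2 = 8).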

To get the sub-arrays, simply iterate over the different tuples of slice objects and use each tuple to index the array.

import numpy as np
from itertools import product

def iterslice(data_shape, width):
    # check for invalid width
    # (no parentheses around the pair: asserting a non-empty tuple is always true)
    assert all(sh >= width for sh in data_shape), \
        'all axes lengths must be at least equal to width'

    # gather all allowed starting indices for the data shape
    start_indices = [range(sh-width+1) for sh in data_shape]

    # create tuples of all allowed starting indices
    start_coords = product(*start_indices)

    # iterate over tuples of slice objects that have the same number of
    # dimensions as data_shape; each tuple can be used to index the array
    for start_coord in start_coords:
        yield tuple(slice(coord, coord+width) for coord in start_coord)

# create 5x5x5 array
arr = np.arange(0,5**3).reshape(5,5,5)

# create the data slice tuple iterator for 3x3x3 sub-arrays
data_slices = iterslice(arr.shape, 3)

# the sub-arrays are a list of 3x3x3 arrays, in this case
sub_arrays = [arr[ds] for ds in data_slices]
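As an alternative to the generator above (not part of the original answer), newer NumPy releases ship `numpy.lib.stride_tricks.sliding_window_view`, which returns all blocks as a single view. The sketch below assumes NumPy 1.20 or later; the names `windows`, `block`, and `blocks` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

arr = np.arange(5**3).reshape(5, 5, 5)
width = 3

# One window length per axis, so the same call works for any number of dimensions
windows = sliding_window_view(arr, (width,) * arr.ndim)

# Leading axes index the start coordinate, trailing axes hold the block itself
print(windows.shape)  # (3, 3, 3, 3, 3, 3)

# The block starting at (1, 0, 2) matches the equivalent explicit slice
block = windows[1, 0, 2]
assert np.array_equal(block, arr[1:4, 0:3, 2:5])

# Collapse the leading axes to get a flat sequence of blocks (this makes a copy)
blocks = windows.reshape((-1,) + (width,) * arr.ndim)
print(blocks.shape)  # (27, 3, 3, 3)
```

The view returned by `sliding_window_view` is read-only by default and shares memory with `arr`, so it is cheap to create; the final `reshape` copies the data, which may matter for large arrays.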