Question

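I need to fill N rectangular regions of a zero-filled 2D array with ones. The regions to fill are stored in an Nx4 numpy array, where each row holds the bounds of one rectangle (x_low, x_high, y_low, y_high). This is currently the slowest part of my work, and I'm wondering whether it can be done faster.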

Right now this is done by simply looping over the array of regions and filling the target array with slice assignment:

import numpy as np
def fill_array_with_ones(coordinates_array, target_array):
    for row in coordinates_array:
        target_array[row[0]:row[1], row[2]:row[3]] = 1

coords = np.array([[1,3,1,3], [3,5,3,5]])
target = np.zeros((5,5))
fill_array_with_ones(coords, target)
print(target)

Output:

[[0. 0. 0. 0. 0.]
 [0. 1. 1. 0. 0.]
 [0. 1. 1. 0. 0.]
 [0. 0. 0. 1. 1.]
 [0. 0. 0. 1. 1.]]

I was hoping there is some numpy magic that would let me do this in a vectorized way, avoiding the iteration over rows and hopefully speeding things up:

target[bounds_to_slices(coords)] = 1

Answer 1

I ran some tests on the method mentioned in the comments and on the for-loop method. Are you sure this is really your bottleneck?

# prepare data
import numpy as np
a = np.zeros((1024, 1024), '?')
bound = np.random.randint(0, 1024, (9999,4), 'H')

# build indices, as indices can be pre-computed, don't time it
x = np.arange(1024, dtype='H')[:,None]
y = x[:,None]

# the forced vectorized method
%%timeit
ymask = y >= bound[:,0]
ymask &= y < bound[:,1]
xmask = x >= bound[:,2]
xmask &= x < bound[:,3]
a[(ymask & xmask).any(2)] = True
# outputs 3.06 s ± 1.35 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
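
To see why the broadcast version is so heavy, here is a rough sketch of the size of the intermediate `ymask & xmask`. The variable names are illustrative only, and `np.broadcast_shapes` assumes NumPy 1.20 or newer.

```python
import numpy as np

# Sizes taken from the benchmark: a 1024x1024 target and 9999 boxes.
n_rows, n_cols, n_boxes = 1024, 1024, 9999

# ymask has shape (n_rows, 1, n_boxes) and xmask has shape (n_cols, n_boxes),
# so their AND broadcasts to one boolean per pixel per box.
shape = np.broadcast_shapes((n_rows, 1, n_boxes), (n_cols, n_boxes))
temp_bytes = int(np.prod(shape, dtype=np.int64)) * np.dtype(bool).itemsize

print(shape)             # (1024, 1024, 9999)
print(temp_bytes / 1e9)  # roughly 10.5 (GB)
```

The temporary grows linearly with the number of boxes, while the loop only ever touches the cells inside each rectangle.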

# the normal method
%%timeit
# columns are (low, high) for axis 0, then (low, high) for axis 1, as in the masks above
for i,j,k,l in bound:
    a[i:j,k:l] = True
# outputs 22.8 ms ± 28.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

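Not only is the "vectorized" method always slower, no matter how many bounding boxes there are, it also creates a 10 GB temporary array here. The plain loop, on the other hand, is quite fast.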
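
If the loop ever does turn out to be the bottleneck, one loop-free alternative that avoids the huge temporary is a 2D difference array: mark the four corners of each rectangle, then take cumulative sums along both axes. This is a different technique from the ones timed above, not something from the question or the comments, and it has not been benchmarked here. The function name `fill_rectangles_diff` is hypothetical, and the sketch assumes non-negative bounds in the same column order as the question.

```python
import numpy as np

def fill_rectangles_diff(bounds, shape):
    """Return a boolean mask of cells covered by any [r0:r1, c0:c1] rectangle.

    Assumes non-negative bounds laid out as (r0, r1, c0, c1) per row.
    """
    bounds = np.asarray(bounds, dtype=np.intp)
    # Clip upper limits to the array size, as slicing would do.
    r0 = np.minimum(bounds[:, 0], shape[0])
    r1 = np.minimum(bounds[:, 1], shape[0])
    c0 = np.minimum(bounds[:, 2], shape[1])
    c1 = np.minimum(bounds[:, 3], shape[1])
    # Drop empty rectangles (low >= high), which slicing silently ignores.
    keep = (r0 < r1) & (c0 < c1)
    r0, r1, c0, c1 = r0[keep], r1[keep], c0[keep], c1[keep]

    # One extra row and column so the "high" corners always fit.
    diff = np.zeros((shape[0] + 1, shape[1] + 1), dtype=np.int32)
    # np.add.at accumulates correctly even when corners repeat.
    np.add.at(diff, (r0, c0), 1)
    np.add.at(diff, (r1, c1), 1)
    np.add.at(diff, (r0, c1), -1)
    np.add.at(diff, (r1, c0), -1)

    # The 2D prefix sum gives, per cell, how many rectangles cover it.
    coverage = diff.cumsum(axis=0).cumsum(axis=1)[:-1, :-1]
    return coverage > 0

coords = np.array([[1, 3, 1, 3], [3, 5, 3, 5]])
mask = fill_rectangles_diff(coords, (5, 5))
print(mask.astype(float))
```

The memory cost is one extra array of the target's size regardless of the number of boxes. Whether it actually beats the plain loop depends on the array size and box count, so it is worth timing on real data before switching.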
