Question

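I have a numpy array, a list of start/end indices that define ranges into the array, and a list of values, where the number of values equals the number of ranges. Doing this assignment in a loop is currently very slow, so I'd like to assign the values to the corresponding ranges in the array in a vectorized manner. Is this possible?

Here is a concrete, simplified example: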

a = np.zeros([10])

Here are the lists of start and end indices that define the ranges within a:

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]

Here is the list of values I want to assign to each range:

values = [1, 2, 3, 4]

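I have two problems. First, I can't figure out how to index into the array with multiple slices at once, because the list of ranges is constructed dynamically in the actual code. Second, once I can extract the ranges, I'm not sure how to assign multiple values at once, one value per range.

Here is how I tried to create a list of slices, and the errors I ran into when using that list to index into the array: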

slices = [slice(start, end) for start, end in zip(starts, ends)]


In [97]: a[slices]
...
IndexError: too many indices for array

In [98]: a[np.r_[slices]]
...
IndexError: arrays used as indices must be of integer (or boolean) type

If I use a static list, I can extract multiple slices at once, but then the assignment doesn't work the way I want:

In [106]: a[np.r_[0:2, 2:4, 4:6, 6:8]] = [1, 2, 3]
/usr/local/bin/ipython:1: DeprecationWarning: assignment will raise an error in the future, most likely because your index result shape does not match the value array shape. You can use `arr.flat[index] = values` to keep the old behaviour.
  #!/usr/local/opt/python/bin/python2.7

In [107]: a
Out[107]: array([ 1.,  2.,  3.,  1.,  2.,  3.,  1.,  2.,  0.,  0.])

What I actually want is:

np.array([1., 1., 2., 2., 3., 3., 4., 4., 0., 0.])

Answer 1

This will do it in a fully vectorized manner:

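import numpy as np
# The arithmetic below needs arrays; subtracting two plain lists raises TypeError
starts = np.asarray(starts)
ends = np.asarray(ends)

counts = ends - starts
idx = np.ones(counts.sum(), dtype=int)  # np.int was removed in NumPy 1.24
idx[np.cumsum(counts)[:-1]] -= counts[:-1]
idx = np.cumsum(idx) - 1 + np.repeat(starts, counts)

a[idx] = np.repeat(values, counts)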
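
To see why the index arithmetic above works, here is a minimal sketch that wraps it in a function and applies it to ranges of unequal length. The function name `range_indices` is hypothetical (not from the answer), and the sketch assumes every range is non-empty, i.e. each end is greater than its start.

```python
import numpy as np

def range_indices(starts, ends):
    """Build concatenated indices for each [start, end) range without a Python loop.

    Assumption: every range is non-empty (end > start).
    """
    starts = np.asarray(starts)
    ends = np.asarray(ends)
    counts = ends - starts
    # Begin with a step of +1 for every output position ...
    steps = np.ones(counts.sum(), dtype=int)
    # ... then, at the first position of each range after the first, subtract the
    # previous range's length so the running offset restarts there.
    steps[np.cumsum(counts)[:-1]] -= counts[:-1]
    # Offsets within each range: 0, 1, ..., count - 1
    offsets = np.cumsum(steps) - 1
    # Shift each offset by the start of the range it belongs to
    return offsets + np.repeat(starts, counts), counts

starts = [0, 3, 7]
ends = [2, 6, 8]
values = [1, 2, 3]

idx, counts = range_indices(starts, ends)
# idx is [0, 1, 3, 4, 5, 7]

a = np.zeros(10)
a[idx] = np.repeat(values, counts)
# a is [1, 1, 0, 2, 2, 2, 0, 3, 0, 0]
```

The cumulative sum produces a counter that resets at each range boundary, and `np.repeat(starts, counts)` then shifts each counter to where its range begins.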

Answer 2

One possibility is to zip the start and end indices with the values, and broadcast the indices and values manually:

import numpy as np

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
values = [1, 2, 3, 4]
a = np.zeros(10)

# expand each (start, end, value) triple into a list of indices and a matching list of repeated values
idx, val = zip(*[(list(range(s, e)), [v] * (e - s)) for s, e, v in zip(starts, ends, values)])

# assign values; np.concatenate also copes with ranges of different lengths
a[np.concatenate(idx)] = np.concatenate(val)

a
# array([ 1.,  1.,  2.,  2.,  3.,  3.,  4.,  4.,  0.,  0.])

Alternatively, write a for loop that assigns a value to each range in turn:

for s, e, v in zip(starts, ends, values):
    a[s:e] = v

a
# array([ 1.,  1.,  2.,  2.,  3.,  3.,  4.,  4.,  0.,  0.])
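
For the dynamically built `slices` list from the question, the same manual expansion can be sketched with `np.r_`. As far as I can tell, `np.r_` expands slice objects when they arrive as a tuple, whereas a plain list is treated as a single array-like item, which would explain the IndexError shown in the question. Note that `np.r_` still walks the slices in Python internally, so this is a convenience rather than a fully vectorized solution.

```python
import numpy as np

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
values = [1, 2, 3, 4]
a = np.zeros(10)

# Build the slices dynamically, as in the question
slices = [slice(s, e) for s, e in zip(starts, ends)]

# Passing a tuple makes np.r_ expand each slice into its integer indices
idx = np.r_[tuple(slices)]

# Repeat each value once per element of its range
counts = np.subtract(ends, starts)
a[idx] = np.repeat(values, counts)
# a is [1, 1, 2, 2, 3, 3, 4, 4, 0, 0]
```

Because `np.repeat` takes a per-value count, this also works when the ranges have different lengths.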

Assign multiple values to multiple slices of a numpy array at once
