Question

I have an empty array:

empty = np.array([0, 0, 0, 0, 0])

an array of indices corresponding to positions in that array:

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])

and an array of values:

val = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

I want to add the values in 'val' into 'empty' at the positions given by 'ind'.

The non-vectorized solution is (here maps plays the role of empty):

for i, v in zip(ind, val): maps[i] += v
>>> maps
[ 0.  4.  5.  1.  2.]

My actual arrays are multidimensional, so speed matters. I would really like a vectorized solution, or at least a very fast one.

Note that this does not work:

maps[ind] += val
>>> maps
array([ 0.,  1.,  1.,  1.,  1.])

I would greatly appreciate a solution that runs on Python 2.7, 3.5, and 3.6 without hiccups.

Answer 1

You can use np.add.at, which is equivalent to empty[ind] += val, except that results are accumulated for elements that are indexed more than once.

>>> np.add.at(empty, ind, val)
>>> empty
array([0, 4, 5, 1, 2])
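To make the difference concrete, here is a minimal sketch that runs both forms on the question's data. The names buffered and accumulated are only illustrative and do not appear in the question.

```python
import numpy as np

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])
val = np.ones(ind.size, dtype=int)

# Fancy-index in-place add: a repeated index is only written once,
# so each touched slot ends up incremented a single time.
buffered = np.zeros(5, dtype=int)
buffered[ind] += val

# Unbuffered add: every occurrence of an index contributes.
accumulated = np.zeros(5, dtype=int)
np.add.at(accumulated, ind, val)

print(buffered)     # [0 1 1 1 1]
print(accumulated)  # [0 4 5 1 2]
```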

Answer 2

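What you are looking for is e = np.bincount(ind, weights=val, minlength=n), where n is the length of empty. That way you don't have to initialize empty at all. You only need to create e this way the first time; after that you can accumulate with e += np.bincount(ind, weights=val, minlength=n). Keep minlength in the update so the shapes still match when the largest index is missing from ind.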

This is at least twice as fast as np.add.at:

%timeit np.bincount(ind, val, minlength=empty.size)
The slowest run took 12.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.05 µs per loop

%timeit np.add.at(empty, ind, val)
The slowest run took 2822.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.32 µs per loop
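The timings above are for a 12-element input, where call overhead dominates. To check the comparison on larger data, something like the following sketch could be used. The sizes, the seed, and the helper names with_bincount and with_add_at are assumptions chosen for illustration, and no particular speed ratio is implied.

```python
import timeit
import numpy as np

# Hypothetical problem size: 100000 updates into 1000 slots.
rng = np.random.RandomState(0)
n = 1000
big_ind = rng.randint(0, n, size=100000)
big_val = rng.rand(big_ind.size)

def with_bincount():
    # Weighted count per index; minlength pads missing trailing slots.
    return np.bincount(big_ind, weights=big_val, minlength=n)

def with_add_at():
    out = np.zeros(n)
    np.add.at(out, big_ind, big_val)
    return out

# Both approaches should produce the same sums before timing them.
assert np.allclose(with_bincount(), with_add_at())

t_bincount = timeit.timeit(with_bincount, number=20)
t_add_at = timeit.timeit(with_add_at, number=20)
print("bincount: {:.4f}s  add.at: {:.4f}s".format(t_bincount, t_add_at))
```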

For multidimensional indices, you can do this:

np.bincount(np.ravel_multi_index(ind, empty.shape), np.ravel(val), minlength=empty.size).reshape(empty.shape)

I'm not sure how to use np.add.at in that case to compare the speed.
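As a possible way to set up that comparison, np.add.at also accepts a tuple of index arrays, one per axis. The sketch below assumes a small 2-D target; the shape and the rows, cols, and val data are made up for illustration.

```python
import numpy as np

shape = (3, 4)  # hypothetical target shape
rows = np.array([0, 2, 0, 1, 2, 0])
cols = np.array([1, 3, 1, 0, 3, 1])
val = np.array([1., 2., 3., 4., 5., 6.])

# Flatten the (row, col) pairs to 1-D positions, sum with bincount,
# then restore the original shape.
flat = np.ravel_multi_index((rows, cols), shape)
via_bincount = np.bincount(
    flat, weights=val, minlength=shape[0] * shape[1]
).reshape(shape)

# The same accumulation with np.add.at and a tuple of index arrays.
via_add_at = np.zeros(shape)
np.add.at(via_add_at, (rows, cols), val)

print(np.array_equal(via_bincount, via_add_at))  # True
```

Here position (0, 1) receives 1 + 3 + 6 = 10, position (2, 3) receives 2 + 5 = 7, and position (1, 0) receives 4.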

Answer 3

This is basically a histogram, so in the one-dimensional case:

h, b = np.histogram(ind, bins=np.arange(empty.size+1), weights=val)
empty += h

Of course, if empty contains only zeros, you can omit the second statement.
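A quick check on the question's data, as a sketch: with bin edges 0, 1, ..., 5, each integer index k falls into bin k, so the weighted histogram should match the weighted bincount.

```python
import numpy as np

empty = np.array([0, 0, 0, 0, 0])
ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])
val = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

# Edges 0..5 give one bin per slot of `empty`; the last bin is closed,
# so index 4 lands in the final bin.
h, b = np.histogram(ind, bins=np.arange(empty.size + 1), weights=val)

# Reference result from bincount for comparison.
ref = np.bincount(ind, weights=val, minlength=empty.size)

print(h.tolist())              # [0, 4, 5, 1, 2]
print(np.array_equal(h, ref))  # True
```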

Vectorized sum of array according to indices of second array
