Question

I have two numpy arrays, each of shape (10000, 10000). One is a value array and the other is an index array.

import numpy as np
Value = np.random.rand(10000, 10000)
Index = np.random.randint(0, 1000, (10000, 10000))

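I want to build a list (or 1D numpy array) by summing the entries of the "value array" grouped by the "index array". For example, for each index i, find the positions in the index array that equal i and use them to index into the value array: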

NewArray = np.zeros(1000)  # one slot per index value 0..999
for i in range(1000):
    NewArray[i] = np.sum(Value[np.where(Index == i)])

However, this is too slow because I have to loop over 300,000 arrays. I tried to come up with a logical indexing approach such as

NewArray[Index] += Value[Index]

but it did not work. The next thing I tried was a dictionary

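from collections import defaultdict
NewDict = defaultdict(list)  # assumed setup; the original post does not show it
for k, v in zip(Index.flatten(), Value.flatten()):
    NewDict[k].append(v)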

and

for i in NewDict:
    NewDict[i] = np.sum(NewDict[i])

but that was slow too.

Is there a clever way to speed this up?

Answer 1

I have two ideas. First, try masking, which speeds things up by about 4x:

for i in range(1000):
    NewArray[i] = np.sum(Value[Index==i])
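
As a small-scale sanity check, the sketch below confirms that the boolean mask selects the same elements as the np.where form from the question. The array sizes, the number of groups and the fixed seed are assumptions chosen so the snippet runs quickly; they are not the asker's data.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, only for reproducibility
value = rng.random((200, 200))            # small stand-in for the 10000x10000 Value
index = rng.integers(0, 10, (200, 200))   # small stand-in for Index, 10 groups

n_groups = 10
sums_where = np.zeros(n_groups)
sums_mask = np.zeros(n_groups)
for i in range(n_groups):
    sums_where[i] = np.sum(value[np.where(index == i)])  # form used in the question
    sums_mask[i] = np.sum(value[index == i])             # masked form

# Both forms pick the same elements, so the sums should agree.
same = np.allclose(sums_where, sums_mask)
```

The mask version skips building the tuple of coordinate arrays that np.where returns, which is presumably where the reported speedup comes from.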

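Alternatively, you can sort the arrays so that the values to be added together sit in contiguous memory. Masking or using where() has to gather all the matching values each time you call sum on the selection. By front-loading that gathering into a single sort, you can speed things up considerably: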

# flatten your arrays
vals = Value.ravel()
inds = Index.ravel()
s = np.argsort(inds)  # these are the indices that will sort your Index array

v_sorted = vals[s].copy()  # fancy indexing already returns a new contiguous array, so the copy is redundant but harmless
i_sorted = inds[s].copy()
searches = np.searchsorted(i_sorted, np.arange(0, i_sorted[-1] + 2)) # 1 greater than your max, this gives you your array end...
for i in range(len(searches) -1):
    st = searches[i]
    nd = searches[i+1]
    NewArray[i] = v_sorted[st:nd].sum()
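
The sorting approach above can be wrapped in a function and checked against the plain masked loop on small inputs. This is only a sketch: the function name `grouped_sum_sorted`, the `n_groups` parameter and the test sizes are hypothetical, and it assumes every label lies in the range 0 to n_groups - 1.

```python
import numpy as np

def grouped_sum_sorted(values, indices, n_groups):
    """Sum `values` per integer label in `indices` by sorting first.

    Hypothetical helper; assumes labels are integers in [0, n_groups).
    """
    vals = values.ravel()
    inds = indices.ravel()
    order = np.argsort(inds, kind="stable")  # positions that sort the labels
    v_sorted = vals[order]                   # values grouped by label, contiguous
    i_sorted = inds[order]
    # bounds[g] is the first position of label g, bounds[g + 1] is one past its last
    bounds = np.searchsorted(i_sorted, np.arange(n_groups + 1))
    out = np.zeros(n_groups)
    for g in range(n_groups):
        out[g] = v_sorted[bounds[g]:bounds[g + 1]].sum()
    return out

rng = np.random.default_rng(1)
value = rng.random((300, 300))
index = rng.integers(0, 20, (300, 300))

result = grouped_sum_sorted(value, index, 20)
expected = np.array([value[index == g].sum() for g in range(20)])
```

Passing the group count explicitly, rather than deriving it from the largest label, keeps the output length fixed even if some labels never occur; such groups simply sum to zero.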

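This method takes 26 seconds on my machine, compared with 400 seconds for the old approach. Good luck. If you want to learn more about contiguous memory and performance, check this discussion out.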
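
Not part of the answer above, but possibly worth trying as an alternative: np.bincount accepts a weights argument and returns per-label sums without a Python loop, and np.add.at accumulates correctly when an index occurs more than once, which a plain `NewArray[...] += ...` with fancy indexing does not. The sketch below uses small hypothetical inputs and checks both against the masked loop; it makes no claim about timings on the full-size arrays.

```python
import numpy as np

rng = np.random.default_rng(2)
value = rng.random((300, 300))            # small stand-in for Value
index = rng.integers(0, 20, (300, 300))   # small stand-in for Index, 20 groups
n_groups = 20

# Weighted bincount: adds each value into the bin given by its label.
sums_bincount = np.bincount(index.ravel(), weights=value.ravel(),
                            minlength=n_groups)

# Unbuffered in-place add: repeated labels all contribute to the total.
sums_add_at = np.zeros(n_groups)
np.add.at(sums_add_at, index.ravel(), value.ravel())

# Reference result from the masked loop.
reference = np.array([value[index == g].sum() for g in range(n_groups)])
```

np.bincount requires non-negative integer labels, which matches the Index array in the question.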

Speeding up fancy indexing with numpy
