I have an index matrix that defines regions and looks like this:
0 0 0 0 1 1 1
0 0 0 1 1 1 1
0 0 1 1 1 1 2
0 1 1 1 1 1 2
2 2 2 2 2 2 2
3 3 3 3 3 3 3
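I have another matrix of the same size that holds weights. I want to compute a weighted sum over each region efficiently. Here is my first attempt: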
n = indices.max() + 1
xSum, ySum, dSum = np.zeros(n), np.zeros(n), np.zeros(n)
for j in range(weights.shape[1]):
    for i in range(weights.shape[0]):
        ind = indices[i, j]
        density = weights[i, j]
        xSum[ind] += density * i
        ySum[ind] += density * j
        dSum[ind] += density
x, y = xSum / dSum, ySum / dSum
Obviously, native loops in Python are not very fast. My second attempt uses masking:
x, y = [], []
row_matrix = np.fromfunction(lambda i, j: i, weights.shape)
col_matrix = np.fromfunction(lambda i, j: j, weights.shape)
num_regions = indices.max() + 1
for ind in range(num_regions):
    mask = (indices == ind)
    xSum = sum(weights[mask] * row_matrix[mask])
    ySum = sum(weights[mask] * col_matrix[mask])
    dSum = sum(weights[mask])
    x.append(xSum / dSum)
    y.append(ySum / dSum)
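The question is: can I do this any faster, without loops, using only array operations?
For testing, you can generate large random matrices like this: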
indices = np.random.randint(0, 100, (1000, 1000))
weights = np.random.rand(1000, 1000)
On this data set, the first version takes 1.8 seconds and the second takes 0.9 seconds.
Answer 0: (score: 3)
Use np.bincount:
import numpy as np
indices = np.random.randint(0, 100, (1000, 1000))
weights = np.random.rand(1000, 1000)
def orig(indices, weights):
    x, y = [], []
    row_matrix = np.fromfunction(lambda i, j: i, weights.shape)
    col_matrix = np.fromfunction(lambda i, j: j, weights.shape)
    num_regions = indices.max()+1
    for ind in range(num_regions):
        mask = (indices == ind)
        xSum = sum(weights[mask] * row_matrix[mask])
        ySum = sum(weights[mask] * col_matrix[mask])
        dSum = sum(weights[mask])
        x.append(xSum / dSum)
        y.append(ySum / dSum)
    return x, y
def alt(indices, weights):
    indices = indices.ravel()
    h, w = weights.shape
    # open grids of shape (h, 1) and (1, w); they broadcast against weights
    row_matrix, col_matrix = np.ogrid[:h, :w]
    dSum = np.bincount(indices, weights=weights.ravel())
    xSum = np.bincount(indices, weights=(weights*row_matrix).ravel())
    ySum = np.bincount(indices, weights=(weights*col_matrix).ravel())
    return xSum/dSum, ySum/dSum
expected_x, expected_y = orig(indices, weights)
result_x, result_y = alt(indices, weights)
# check that the result is the same
assert np.allclose(expected_x, result_x)
assert np.allclose(expected_y, result_y)
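To see why this works: np.bincount(labels, weights=w) adds each w[k] into the output slot labels[k], which is the same per-region accumulation that the loops do by hand. A minimal sketch with toy values (the arrays labels and w below are made up for illustration and are not part of the question's data):

```python
import numpy as np

# Toy data, made up for illustration: three regions labelled 0, 1, 2.
labels = np.array([0, 0, 1, 2, 1])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Each weight is added into the slot named by its label.
per_region = np.bincount(labels, weights=w)

# Equivalent explicit accumulation, shown only for comparison.
manual = np.zeros(labels.max() + 1)
for k, lab in enumerate(labels):
    manual[lab] += w[k]
```

Both per_region and manual hold one total per label, so calling bincount three times (with the weights, the weights times the row index, and the weights times the column index) gives every sum the centroid needs.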
Here is a benchmark:
In [163]: %timeit orig(indices, weights)
1 loops, best of 3: 966 ms per loop
In [164]: %timeit alt(indices, weights)
10 loops, best of 3: 20.8 ms per loop
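One caveat, which concerns inputs the question does not cover: if some label below indices.max() never occurs, or a region's weights sum to zero, dSum is 0 for that region and the division produces nan (with a runtime warning). The following is a hedged sketch, not part of the original answer; the function name centroids and the num_regions parameter are hypothetical. It fixes the output length with minlength and leaves such regions as nan without triggering the warning:

```python
import numpy as np

def centroids(indices, weights, num_regions=None):
    """Weighted (row, col) centroid per region; nan where total weight is 0."""
    flat = indices.ravel()
    if num_regions is None:
        # Assumption: labels are non-negative integers starting at 0.
        num_regions = int(flat.max()) + 1
    h, w = weights.shape
    rows, cols = np.ogrid[:h, :w]
    d = np.bincount(flat, weights=weights.ravel(), minlength=num_regions)
    xs = np.bincount(flat, weights=(weights * rows).ravel(), minlength=num_regions)
    ys = np.bincount(flat, weights=(weights * cols).ravel(), minlength=num_regions)
    # Divide only where the total weight is non-zero; other slots stay nan.
    ok = d != 0
    out_x = np.full(num_regions, np.nan)
    out_y = np.full(num_regions, np.nan)
    np.divide(xs, d, out=out_x, where=ok)
    np.divide(ys, d, out=out_y, where=ok)
    return out_x, out_y
```

Passing num_regions explicitly is useful when the number of regions is known in advance and the highest labels might be absent from a particular image.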