Question

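I am using numpy to do linear algebra. I want to do fast subset-indexed dot products and other linear operations.

When dealing with large matrices, a solution like A[:,subset].dot(x[subset]) can take longer than doing the multiplication on the full matrix.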

import numpy as np

A = np.random.randn(1000,10000)
x = np.random.randn(10000,1)
subset = np.sort(np.random.randint(0,10000,500))  # sorted column indices; randint may repeat some

Timings show that sub-indexing can be faster when the columns are in one contiguous block.

%timeit A.dot(x)
100 loops, best of 3: 4.19 ms per loop

%timeit A[:,subset].dot(x[subset])
100 loops, best of 3: 7.36 ms per loop

%timeit A[:,:500].dot(x[:500])
1000 loops, best of 3: 1.75 ms per loop
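
For anyone reproducing these numbers outside IPython, here is a minimal sketch using the standard timeit module. The seed, the helper name best_ms, and the repeat counts are assumptions for illustration, and absolute timings will vary with the machine and BLAS build. One likely reason the scattered case is slow: A[:, subset] is advanced indexing, which copies the selected columns into a new array before the product, whereas A[:, :500] is a view on the original data.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
A = rng.standard_normal((1000, 10000))
x = rng.standard_normal((10000, 1))
subset = np.sort(rng.integers(0, 10000, 500))


def best_ms(func, number=10, repeat=3):
    """Return the best per-call time of func in milliseconds."""
    times = timeit.repeat(func, number=number, repeat=repeat)
    return min(times) / number * 1e3


timings = {
    "full": best_ms(lambda: A.dot(x)),                      # whole matrix
    "fancy": best_ms(lambda: A[:, subset].dot(x[subset])),  # scattered columns (copy)
    "block": best_ms(lambda: A[:, :500].dot(x[:500])),      # contiguous columns (view)
}
for name, ms in timings.items():
    print(f"{name:>5}: {ms:.2f} ms")
```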

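Still, the speedup is not what I expected (20x faster!).

Does anyone know of a library or module that allows this kind of fast operation through numpy or scipy?

For now I am using cython to write a fast column-indexed dot product through the cblas library. But for more complex operations (pseudo-inverse, or sub-indexed least-squares solving) I do not expect to get a good speedup that way.

Thanks!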
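
As a baseline for the more complex operations mentioned here, the straightforward NumPy versions look roughly like the sketch below. The sizes and the right-hand side b are made up for illustration. It copies the selected columns and does not attempt the speedup being asked about.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((300, 3000))
b = rng.standard_normal((300, 1))  # hypothetical right-hand side
subset = np.sort(rng.choice(3000, size=50, replace=False))

A_sub = A[:, subset]  # advanced indexing: copies the 50 selected columns

# Least-squares solution of A_sub @ sol ~= b
sol, residuals, rank, sv = np.linalg.lstsq(A_sub, b, rcond=None)

# Moore-Penrose pseudo-inverse of the column subset
A_sub_pinv = np.linalg.pinv(A_sub)

print(sol.shape, A_sub_pinv.shape)
```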

Answer 1

Well, this is faster.

%timeit A.dot(x)
#4.67 ms

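%%timeit
y = np.zeros_like(x)   # zero vector with the same shape as x
y[subset] = x[subset]  # copy over only the selected entries
d = A.dot(y)           # full product; the zeroed entries contribute nothing
#4.77 ms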

%timeit c = A[:,subset].dot(x[subset])
#7.21 ms

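The two results match: np.allclose(d, c) is True, as long as subset contains no repeated indices. (np.random.randint can draw the same index more than once, and A[:,subset].dot(x[subset]) then counts that column once per repeat, while the zero-filled version counts it only once.)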
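
The zero-fill trick can be wrapped in a small function and checked against the fancy-indexed product. This is a sketch, not a tuned implementation: the name masked_dot and the smaller sizes are assumptions for illustration, and the function assumes subset has no duplicate indices.

```python
import numpy as np


def masked_dot(A, x, subset):
    """Compute A[:, subset] @ x[subset] via a zero-filled copy of x.

    Assumes subset contains no repeated indices.
    """
    y = np.zeros_like(x)
    y[subset] = x[subset]
    return A.dot(y)


rng = np.random.default_rng(0)
A = rng.standard_normal((200, 2000))
x = rng.standard_normal((2000, 1))

# Indices drawn without replacement, so the two approaches should agree.
subset = np.sort(rng.choice(2000, size=100, replace=False))
d = masked_dot(A, x, subset)
c = A[:, subset].dot(x[subset])
agree_unique = np.allclose(d, c)

# With a repeated index the fancy-indexed product counts column 3 twice.
dup = np.array([3, 3, 7])
agree_dup = np.allclose(masked_dot(A, x, dup), A[:, dup].dot(x[dup]))

print(agree_unique, agree_dup)
```

If duplicates are possible, passing np.unique(subset) to both versions makes them comparable again.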

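Note that how fast this is depends on the input. With subset = array([1,2,3]), the time of my solution stays almost the same, while the last solution takes 46 microseconds.

Basically, this will be faster when the size of subset is not too small compared with the size of x.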
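
Where the crossover lies depends on the machine and BLAS build, so it is probably best measured directly. The sketch below times both approaches for a few subset sizes. The helper name compare, the matrix shape, and the chosen sizes are assumptions for illustration.

```python
import timeit

import numpy as np


def compare(A, x, n_sub, rng, number=5, repeat=3):
    """Return (zero_fill_ms, fancy_ms) for a random subset of n_sub columns."""
    subset = np.sort(rng.choice(A.shape[1], size=n_sub, replace=False))

    def zero_fill():
        y = np.zeros_like(x)
        y[subset] = x[subset]
        return A.dot(y)

    def fancy():
        return A[:, subset].dot(x[subset])

    t_zero = min(timeit.repeat(zero_fill, number=number, repeat=repeat)) / number
    t_fancy = min(timeit.repeat(fancy, number=number, repeat=repeat)) / number
    return t_zero * 1e3, t_fancy * 1e3


rng = np.random.default_rng(0)
A = rng.standard_normal((500, 5000))
x = rng.standard_normal((5000, 1))

results = {}
for n_sub in (3, 50, 500, 2500):
    results[n_sub] = compare(A, x, n_sub, rng)
    zero_ms, fancy_ms = results[n_sub]
    print(f"n_sub={n_sub:>5}: zero-fill {zero_ms:.3f} ms, fancy {fancy_ms:.3f} ms")
```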

Fast indexed dot product for numpy/scipy