I want to speed up the following double for loop:
import numpy as np
n = 10000
A = np.zeros((n, n))
twopi = 2.0 * np.pi
for i in range(0, n):
    for j in range(0, n):
        A[i,j] = twopi * i*j/n
How can I do this? My idea: compute a vector v containing all the integers from 0 to n-1:
v = np.arange(n)
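Then build two matrices I and J, one whose rows are all equal to v and one whose columns are all equal to v. If I am not mistaken, I could then replace the for loops with: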
A = twopi * I*J/n
Is this correct? How do I build I and J? Is there a better way?
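A minimal sketch of the construction described above, assuming np.meshgrid with indexing="ij" is an acceptable way to build I and J (the question itself does not name a function). n is reduced to 5 here purely for illustration:

```python
import numpy as np

n = 5  # small size for illustration; the question uses n = 10000
twopi = 2.0 * np.pi
v = np.arange(n)

# With indexing="ij", I[i, j] == i (every column equals v)
# and J[i, j] == j (every row equals v).
I, J = np.meshgrid(v, v, indexing="ij")

# Element-wise product replaces the double loop.
A = twopi * I * J / n
```

Note that I and J are full n-by-n arrays, so for large n this holds two extra matrices in memory in addition to A.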
Answer 0 (score: 3)
Like this:
from numpy import outer, pi, arange
n = 10
i = arange(n)
A = 2 * pi / n * outer(i, i)
The outer product does what you want.
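As a quick sanity check, the outer-product result can be compared against the original double loop on a small n. This is only a sketch; the names A_loop, A_outer and idx are chosen here for illustration:

```python
import numpy as np

n = 6  # small size so the Python loop stays fast
twopi = 2.0 * np.pi

# Reference result from the double loop in the question.
A_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        A_loop[i, j] = twopi * i * j / n

# Same matrix from the outer product: outer(idx, idx)[i, j] == i * j.
idx = np.arange(n)
A_outer = 2 * np.pi / n * np.outer(idx, idx)

# allclose tolerates the tiny rounding differences caused by
# the different order of the floating-point operations.
same = np.allclose(A_loop, A_outer)
print(same)
```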
Edit: A simple way to test the performance is as follows:
from time import time
from numpy import outer, pi, arange
n = 10000
t = time()
for _ in range(10):
    i = arange(n)
    A = 2 * pi / n * outer(i, i)
t = time() - t
print("Time used [s]", t)
For me, these 10 repetitions took 5 seconds in total, with no significant difference between using range and arange.
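An alternative sketch using the standard-library timeit module, which handles the repetition itself. The reduced n and the helper name build_outer are assumptions made here so the example runs quickly; they are not part of the original answer:

```python
import timeit
import numpy as np

n = 1000  # assumption: reduced from 10000 to keep the run short

def build_outer():
    # Hypothetical helper wrapping the computation being timed.
    i = np.arange(n)
    return 2 * np.pi / n * np.outer(i, i)

# 5 measurements of 10 calls each; the minimum is usually the
# least disturbed by other activity on the machine.
times = timeit.repeat(build_outer, repeat=5, number=10)
best_per_call = min(times) / 10
print(f"best time per call [s]: {best_per_call:.6f}")
```

Timings depend on the machine and the NumPy build, so the numbers will differ from those quoted in this thread.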
Answer 1 (score: 1)
You can use a list comprehension:
A = np.array([
    [twopi*i*j/n for j in range(n)]
    for i in range(n)])
range starts at zero by default, so writing range(0,n) instead of range(n) is redundant.
Answer 2 (score: 1)
In [118]: matrix = np.array([np.arange(n) for _ in range(n)])
In [119]: matrix
Out[119]:
array([[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
...,
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999]])
In [120]: matrix [0,1]
Out[120]: 1
In [121]: for j in range(n): matrix[:,j] *= j
In [122]: matrix
Out[122]:
array([[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
...,
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001]])
In [123]: matrix = matrix/n
In [124]: matrix = matrix * np.pi*2
In [125]: matrix
Out[125]:
array([[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
...,
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04]])
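The same as a script (note that scaling column j by j gives j*j in every row, as the output above shows, rather than the i*j the question asks for):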
matrix = np.array([np.arange(n) for _ in range(n)])
for j in range(n): matrix[:,j] *= j
matrix = matrix/n
matrix = matrix * np.pi*2
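To obtain i*j with the same row-stacking approach, one option is to scale each row by its index instead of each column. This is a sketch of a variation on the script above, not the answerer's code, with n reduced to 5 for illustration:

```python
import numpy as np

n = 5  # small size for illustration
# Every row is 0..n-1, so matrix[i, j] == j at this point.
matrix = np.array([np.arange(n) for _ in range(n)])

# Scale row i by i, so that matrix[i, j] == i * j.
for i in range(n):
    matrix[i, :] *= i

# Division converts the integer matrix to float.
A = matrix / n * np.pi * 2
```

This still loops n times in Python, so the outer-product and broadcasting answers remain the more direct route.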
Answer 3 (score: 1)
You can indeed use numpy's broadcasting capabilities.
import numpy as np
n = 10000
twopi = 2.0 * np.pi / n
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi
Reshaping the second np.arange() is what makes the product broadcast.
In terms of timing as well, this performs very similarly to calling np.outer() as Dr.V did.
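A small sketch of the shapes involved in that broadcast, with n reduced to 4 and the names row, col and product chosen here for illustration:

```python
import numpy as np

n = 4
row = np.arange(n)                 # shape (n,)
col = np.arange(n).reshape(-1, 1)  # shape (n, 1), a column vector

# Shapes (n,) and (n, 1) broadcast to (n, n):
# product[i, j] == col[i, 0] * row[j] == i * j
product = row * col
print(product.shape)
```

Without the reshape, both operands would have shape (n,) and the product would be an element-wise vector of squares rather than a matrix.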
Setup:
import numpy as np
n = 10000
twopi = 2 * np.pi / n
Timings:
# List comprehension
A = np.array([
    [twopi*i*j for j in range(n)]
    for i in range(n)])
# timeit > 10.2 s ± 129 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# Outer
i = np.arange(n)
A = twopi * np.outer(i, i)
# timeit > 234 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Product
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi
# timeit > 208 ms ± 41.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
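As noted, the outer and product approaches perform very similarly. The list comprehension does not, because it defeats the purpose of numpy, which is to vectorize array computations using distributed subtasks (and low-level optimizations in general).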