Suppose I have two 1D arrays, A and B:
A=[1,2,3,4]
B=[5,6,7,8,9]
I want
C=[[1,5],[1,6],[1,7],[1,8],[1,9],
[2,5],[2,6],[2,7],[2,8],[2,9],
[3,5],[3,6],[3,7],[3,8],[3,9],
[4,5],[4,6],[4,7],[4,8],[4,9]]
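I tried this by creating a new array C and filling it in with a loop:
import numpy as np
# assumes A and B are NumPy arrays, e.g. A = np.array([1,2,3,4])
C = np.empty(shape=(A.shape[0]*B.shape[0], 2))
for i in range(A.shape[0]):
    # rows i*|B| .. (i+1)*|B| hold the pairs whose first element is A[i]
    C[i*B.shape[0]:(i+1)*B.shape[0], 0] = A[i]
    # the second column of that block cycles through all of B
    C[i*B.shape[0]:(i+1)*B.shape[0], 1] = B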
However, I have about 50,000 cases to compute, each with |A| * |B| = 100 * 100. Is there another way (numpy-thonic or pythonic) to reduce the running time?
Answer 0: (score: 2)
Approach #1
Use np.meshgrid, stack the two grids with np.array, then transpose and reshape into pairs -
np.array(np.meshgrid(A,B)).T.reshape(-1,2)
Sample run -
In [3]: A
Out[3]: array([1, 2, 3, 4])
In [4]: B
Out[4]: array([5, 6, 7, 8, 9])
In [5]: np.array(np.meshgrid(A,B)).T.reshape(-1,2)
Out[5]:
array([[1, 5],
       [1, 6],
       [1, 7],
       [1, 8],
       [1, 9],
       [2, 5],
       [2, 6],
       [2, 7],
       [2, 8],
       [2, 9],
       [3, 5],
       [3, 6],
       [3, 7],
       [3, 8],
       [3, 9],
       [4, 5],
       [4, 6],
       [4, 7],
       [4, 8],
       [4, 9]])
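To see why the transpose is needed, it can help to look at the shapes along the way. The sketch below is an illustration rather than part of the original answer. The `indexing="ij"` variant with `np.stack` is one possible equivalent that I am adding for comparison; the names `grids`, `pairs`, and `pairs_ij` are made up for this example.

```python
import numpy as np

A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8, 9])

# meshgrid defaults to indexing="xy", so each grid has shape (len(B), len(A)).
grids = np.array(np.meshgrid(A, B))
print(grids.shape)  # (2, 5, 4)

# .T reverses the axes to (len(A), len(B), 2), so the last axis holds one
# (a, b) pair; reshape then flattens the first two axes into rows.
pairs = grids.T.reshape(-1, 2)

# Possible alternative: request "ij" indexing and stack on a new last axis,
# which gives the (len(A), len(B), 2) layout directly.
pairs_ij = np.stack(np.meshgrid(A, B, indexing="ij"), axis=-1).reshape(-1, 2)

print(np.array_equal(pairs, pairs_ij))  # True
```

Both forms order the rows with A varying slowest, which matches the C requested in the question.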
Approach #2
An initialization-based approach that focuses on performance, especially for large arrays -
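def initialization_based(A,B):
    N = A.size
    M = B.size
    # allocate the (N, M, 2) result once; it takes A's dtype, so B is cast to it
    out = np.empty((N,M,2),dtype=A.dtype)
    # broadcast A down the rows and B across the columns
    out[...,0] = A[:,None]
    out[...,1] = B
    # flatten to (N*M, 2) in place, without copying
    out.shape = (-1,2)
    return out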
In [7]: A = np.random.randint(0,9,(100))
In [8]: B = np.random.randint(0,9,(100))
In [9]: %timeit np.array(np.meshgrid(A,B)).T.reshape(-1,2)
10000 loops, best of 3: 69.1 µs per loop
In [10]: %timeit initialization_based(A,B)
100000 loops, best of 3: 11.1 µs per loop
Timings for the approaches from the other posts, using the same setup -
In [183]: from itertools import product
# @Chris Mueller's soln
In [184]: %timeit [x for x in product(A,B)]
1000 loops, best of 3: 503 µs per loop
# @jyotish's soln
In [185]: %timeit [[i, j] for i in A for j in B]
1000 loops, best of 3: 1.34 ms per loop
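Before relying on the timings, it is worth confirming that the two NumPy approaches return the same array. The sketch below does that. It also shows a hypothetical variant, `initialization_based_mixed` (my own name, not from the answer), which picks the output dtype with `np.result_type` so that a float B is not truncated to A's integer dtype.

```python
import numpy as np

def initialization_based(A, B):
    # Same logic as Approach #2: allocate once, fill by broadcasting.
    out = np.empty((A.size, B.size, 2), dtype=A.dtype)
    out[..., 0] = A[:, None]
    out[..., 1] = B
    out.shape = (-1, 2)
    return out

def initialization_based_mixed(A, B):
    # Hypothetical variant: use a dtype that can hold values from both inputs.
    out = np.empty((A.size, B.size, 2), dtype=np.result_type(A, B))
    out[..., 0] = A[:, None]
    out[..., 1] = B
    return out.reshape(-1, 2)

A = np.random.randint(0, 9, 100)
B = np.random.randint(0, 9, 100)

via_meshgrid = np.array(np.meshgrid(A, B)).T.reshape(-1, 2)
print(np.array_equal(initialization_based(A, B), via_meshgrid))  # True

# With an integer A and a float B, the mixed variant keeps the fractions.
mixed = initialization_based_mixed(np.array([1, 2]), np.array([0.5]))
print(mixed.tolist())  # [[1.0, 0.5], [2.0, 0.5]]
```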
Answer 1: (score: 0)
You can use itertools.product, which returns an iterator.
from itertools import product
[x for x in product([1, 2, 3, 4], [5, 6, 7, 8, 9])]
[(1, 5),
(1, 6),
(1, 7),
(1, 8),
(1, 9),
(2, 5),
(2, 6),
(2, 7),
(2, 8),
(2, 9),
(3, 5),
(3, 6),
(3, 7),
(3, 8),
(3, 9),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(4, 9)]
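Because `product` is lazy, nothing is computed until the iterator is consumed, and it yields tuples rather than a NumPy array. If an array like the C in the question is needed, one option is to materialize the iterator first, as in the sketch below. This is an illustration I am adding, and the variable names are made up.

```python
from itertools import product

import numpy as np

A = [1, 2, 3, 4]
B = [5, 6, 7, 8, 9]

pairs = product(A, B)   # lazy iterator, no pairs built yet
first = next(pairs)     # pulls only the first pair
rest = list(pairs)      # consumes the remaining pairs

# Materialize a fresh iterator and convert it to a (len(A)*len(B), 2) array.
C = np.array(list(product(A, B)))
print(first, len(rest), C.shape)  # (1, 5) 19 (20, 2)
```

Note that an exhausted iterator cannot be reused, which is why `product(A, B)` is called a second time.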
Answer 2: (score: 0)
The Pythonic way is to use a list comprehension:
Given,
A=[1,2,3,4]
B=[5,6,7,8,9]
C = [[i, j] for i in A for j in B]
I strongly recommend using timeit to measure the running time.
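Outside IPython, where `%timeit` is not available, the standard-library `timeit` module can do the same job. The sketch below is a minimal example under assumed inputs (two random 100-element arrays). The helper names `list_comprehension` and `broadcast_fill` are hypothetical, and the numbers it prints will vary by machine.

```python
import timeit

import numpy as np

A = np.random.randint(0, 9, 100)
B = np.random.randint(0, 9, 100)

def list_comprehension(A, B):
    # Pure-Python pairing, as in the answer above.
    return [[i, j] for i in A for j in B]

def broadcast_fill(A, B):
    # NumPy pairing: allocate once, then fill by broadcasting.
    out = np.empty((A.size, B.size, 2), dtype=A.dtype)
    out[..., 0] = A[:, None]
    out[..., 1] = B
    return out.reshape(-1, 2)

for fn in (list_comprehension, broadcast_fill):
    # Run 100 calls per trial, repeat 3 trials, and report the best trial.
    best = min(timeit.repeat(lambda: fn(A, B), number=100, repeat=3)) / 100
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```

Taking the minimum over repeats is the usual way to reduce noise from other processes on the machine.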