Question

Suppose I have two 1D arrays, A and B:

A=[1,2,3,4]
B=[5,6,7,8,9]

I want

C=[[1,5],[1,6],[1,7],[1,8],[1,9],
   [2,5],[2,6],[2,7],[2,8],[2,9],
   [3,5],[3,6],[3,7],[3,8],[3,9],
   [4,5],[4,6],[4,7],[4,8],[4,9]]

I tried this by creating a new array C and filling it in with loops:

C=np.empty(shape=(A.shape[0]*B.shape[0],2))
for i in range(A.shape[0]):
    C[i*B.shape[0]:(i+1)*B.shape[0],0]=A[i]
for i in range(B.shape[0]):
    # B[i] belongs in every |B|-th row, starting at row i
    C[i::B.shape[0],1]=B[i]

However, I have about 50,000 such cases to compute, each with |A| * |B| = 100 * 100. Is there another way (NumPy-thonic or Pythonic) to improve the time complexity?

Answer 1

Approach #1

Use np.meshgrid, wrap the result in np.array, and finally transpose and reshape -

np.array(np.meshgrid(A,B)).T.reshape(-1,2)

Sample run -

In [3]: A
Out[3]: array([1, 2, 3, 4])

In [4]: B
Out[4]: array([5, 6, 7, 8, 9])

In [5]: np.array(np.meshgrid(A,B)).T.reshape(-1,2)
Out[5]: 
array([[1, 5],
       [1, 6],
       [1, 7],
       [1, 8],
       [1, 9],
       [2, 5],
       [2, 6],
       [2, 7],
       [2, 8],
       [2, 9],
       [3, 5],
       [3, 6],
       [3, 7],
       [3, 8],
       [3, 9],
       [4, 5],
       [4, 6],
       [4, 7],
       [4, 8],
       [4, 9]])
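
To see why the transpose is needed, it may help to look at the intermediate shapes. The following is only a small sketch: the names `grid`, `pairs` and `pairs_ij` are illustrative, and the `indexing='ij'` spelling at the end is a variation of my own rather than something the answer uses.

```python
import numpy as np

A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8, 9])

# With the default indexing='xy', meshgrid returns two grids of shape (len(B), len(A))
grid = np.array(np.meshgrid(A, B))
print(grid.shape)      # (2, 5, 4)

# .T reverses the axes, so the last axis now holds one (a, b) pair
print(grid.T.shape)    # (4, 5, 2)

# Flatten the first two axes: row i*len(B)+j is (A[i], B[j])
pairs = grid.T.reshape(-1, 2)

# Variation (an assumption, not from the answer): 'ij' indexing avoids the transpose
pairs_ij = np.stack(np.meshgrid(A, B, indexing='ij'), axis=-1).reshape(-1, 2)
print(np.array_equal(pairs, pairs_ij))
```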

Approach #2

An initialization-based approach with a focus on performance, especially for large arrays -

def initialization_based(A,B):
    N = A.size
    M = B.size
    out = np.empty((N,M,2),dtype=A.dtype)
    out[...,0] = A[:,None]
    out[...,1] = B
    out.shape = (-1,2)
    return out
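
One thing to watch with this approach: the output takes its dtype from A, so a float B would be truncated when A holds integers. A possible variation is sketched below, assuming that is a concern for your data. The name `initialization_based_mixed` is hypothetical, and it uses `reshape` instead of assigning to `.shape`, which gives the same result here.

```python
import numpy as np

def initialization_based_mixed(A, B):
    # Hypothetical variation: choose a dtype that can hold values from both inputs
    out = np.empty((A.size, B.size, 2), dtype=np.result_type(A, B))
    out[..., 0] = A[:, None]   # A[i] is broadcast across row i
    out[..., 1] = B            # B is broadcast down every row
    return out.reshape(-1, 2)

A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8, 9])
C = initialization_based_mixed(A, B)
print(C.shape)   # (20, 2)

# Mixed dtypes: integer A, float B
D = initialization_based_mixed(np.array([1, 2]), np.array([0.5]))
print(D.dtype)   # float64
```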

Runtime test

In [7]: A = np.random.randint(0,9,(100))

In [8]: B = np.random.randint(0,9,(100))

In [9]: %timeit np.array(np.meshgrid(A,B)).T.reshape(-1,2)
10000 loops, best of 3: 69.1 µs per loop

In [10]: %timeit initialization_based(A,B)
100000 loops, best of 3: 11.1 µs per loop

Including the approaches from the other posts, with the same setup -

In [183]: from itertools import product

# @Chris Mueller's soln
In [184]: %timeit [x for x in product(A,B)]
1000 loops, best of 3: 503 µs per loop

# @jyotish's soln
In [185]: %timeit [[i, j] for i in A for j in B]
1000 loops, best of 3: 1.34 ms per loop
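
Before comparing timings, it may be worth confirming that all four approaches produce the same rows in the same order. A rough check along those lines, where the `results` dictionary and the seeded random generator are my own scaffolding rather than part of the original session:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
A = rng.integers(0, 9, 100)
B = rng.integers(0, 9, 100)

def initialization_based(A, B):
    # Same idea as Approach #2, written with reshape
    out = np.empty((A.size, B.size, 2), dtype=A.dtype)
    out[..., 0] = A[:, None]
    out[..., 1] = B
    return out.reshape(-1, 2)

results = {
    "meshgrid": np.array(np.meshgrid(A, B)).T.reshape(-1, 2),
    "initialization_based": initialization_based(A, B),
    "product": np.array(list(product(A, B))),
    "comprehension": np.array([[i, j] for i in A for j in B]),
}

# Compare every result against the meshgrid version
reference = results["meshgrid"]
all_equal = all(np.array_equal(reference, r) for r in results.values())
print(all_equal)
```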

Answer 2

You can use itertools.product, which returns an iterator.

from itertools import product 
[x for x in product([1, 2, 3, 4], [5, 6, 7, 8, 9])]
[(1, 5),
 (1, 6),
 (1, 7),
 (1, 8),
 (1, 9),
 (2, 5),
 (2, 6),
 (2, 7),
 (2, 8),
 (2, 9),
 (3, 5),
 (3, 6),
 (3, 7),
 (3, 8),
 (3, 9),
 (4, 5),
 (4, 6),
 (4, 7),
 (4, 8),
 (4, 9)]
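
Since the question asks for a 2D array rather than a list of tuples, the result would still need converting. Two ways that might work are sketched here; the `np.fromiter` route is my own suggestion, not part of this answer.

```python
import numpy as np
from itertools import chain, product

A = [1, 2, 3, 4]
B = [5, 6, 7, 8, 9]

# Simple route: materialize the tuples, then let NumPy build the array
C = np.array(list(product(A, B)))

# Alternative that skips the intermediate list of tuples:
# flatten the pairs into one stream of numbers, then reshape to two columns
flat = np.fromiter(chain.from_iterable(product(A, B)),
                   dtype=int, count=2 * len(A) * len(B))
C2 = flat.reshape(-1, 2)

print(C.shape)   # (20, 2)
```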

Answer 3

The Pythonic way is to use a list comprehension:

Given,
A=[1,2,3,4]
B=[5,6,7,8,9]

C = [[i, j] for i in A for j in B]

I strongly recommend using timeit to measure the running time.
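
Outside IPython, a measurement with the standard `timeit` module could look roughly like this. The input sizes match the 100 * 100 case from the question, while the helper name `comprehension` and the `number`/`repeat` values are arbitrary choices of mine.

```python
import timeit

A = list(range(100))
B = list(range(100))

def comprehension():
    return [[i, j] for i in A for j in B]

# repeat() returns one total time per run; the minimum is usually the least noisy
runs = timeit.repeat(comprehension, number=100, repeat=3)
per_call = min(runs) / 100
print(f"{per_call * 1e6:.1f} µs per call")
```

The same pattern applies to any of the other approaches: wrap the expression in a function and pass it to `timeit.repeat`.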
