Combinations of all rows in two numpy arrays

Time: 2017-11-06 18:48:21

Tags: python arrays numpy

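I have two arrays, for example one of shape (3,2) and another of shape (10,7). I want all combinations of the two arrays, so that I end up with a 9-column array. In other words, I want every row of the first array combined with every row of the second array.

How can I do this? As far as I can tell, I am not using meshgrid correctly.

Based on previous posts, I was under the impression that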

a1 = np.zeros((10,7))
a2 = np.zeros((3,2))
r = np.array(np.meshgrid(a1, a2)).T.reshape(-1, a1.shape[1] + a2.shape[1])

would work, but that gives me dimensions of (84,10).

3 answers:

Answer 0 (score: 3)

Approach #1

With a focus on performance, here is one approach that uses array initialization and element broadcasting for the assignments -

m1,n1 = a1.shape
m2,n2 = a2.shape
out = np.zeros((m1,m2,n1+n2),dtype=int)
out[:,:,:n1] = a1[:,None,:]
out[:,:,n1:] = a2
out.shape = (m1*m2,-1)
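
As a minimal sketch, the steps above can be wrapped in a reusable function. The name `row_combinations` and the use of `np.result_type` to choose the output dtype (instead of the hard-coded `int`) are assumptions added for illustration, not part of the original answer.

```python
import numpy as np

def row_combinations(a1, a2):
    """Pair every row of a1 with every row of a2 (hypothetical helper)."""
    m1, n1 = a1.shape
    m2, n2 = a2.shape
    # Assumption: pick a dtype that can hold both inputs rather than forcing int.
    out = np.empty((m1, m2, n1 + n2), dtype=np.result_type(a1, a2))
    out[:, :, :n1] = a1[:, None, :]   # broadcast a1 along axis 1
    out[:, :, n1:] = a2               # broadcast a2 along axis 0
    return out.reshape(m1 * m2, n1 + n2)

a1 = np.zeros((10, 7))
a2 = np.zeros((3, 2))
r = row_combinations(a1, a2)
print(r.shape)  # (30, 9)
```

With the shapes from the question this gives 10*3 = 30 rows and 7+2 = 9 columns.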

Explanation:

The trick lies in these two steps:

out[:,:,:n1] = a1[:,None,:]
out[:,:,n1:] = a2

Step #1:

In [227]: np.random.seed(0)

In [228]: a1 = np.random.randint(1,9,(3,2))

In [229]: a2 = np.random.randint(1,9,(2,7))

In [230]: m1,n1 = a1.shape
     ...: m2,n2 = a2.shape
     ...: out = np.zeros((m1,m2,n1+n2),dtype=int)
     ...: 

In [231]: out[:,:,:n1] = a1[:,None,:]

In [232]: out[:,:,:n1]
Out[232]: 
array([[[5, 8],
        [5, 8]],

       [[6, 1],
        [6, 1]],

       [[4, 4],
        [4, 4]]])

In [233]: a1[:,None,:]
Out[233]: 
array([[[5, 8]],

       [[6, 1]],

       [[4, 4]]])

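So, basically, we assign the elements of a1 while keeping its first axis aligned with the first axis of the output. The new axis added to a1 (via None) lines up with the second axis of the output, so the values are filled in along that axis by broadcasting. This is the crux of the method and the source of its performance: no extra memory is allocated for repeated copies, which explicit repeating/tiling methods would require.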
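
To make the memory point concrete, here is a small check, assuming `np.repeat` stands in for the "explicit repeating" alternative. Both routes produce the same values, but the broadcast assignment writes straight into `out` without first building the repeated array.

```python
import numpy as np

a1 = np.array([[5, 8], [6, 1], [4, 4]])
m1, n1 = a1.shape
m2 = 2  # number of rows in the second array, as in the session above

# Broadcast assignment: a1[:, None, :] has shape (m1, 1, n1) and is
# stretched along axis 1 while being written into the output.
out = np.zeros((m1, m2, n1), dtype=int)
out[:, :, :] = a1[:, None, :]

# Explicit alternative: materialize the repeated copies first.
explicit = np.repeat(a1[:, None, :], m2, axis=1)

print(a1[:, None, :].shape)           # (3, 1, 2)
print(np.array_equal(out, explicit))  # True
```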

Step #2:

In [237]: out[:,:,n1:] = a2

In [238]: out[:,:,n1:]
Out[238]: 
array([[[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]],

       [[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]],

       [[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]]])

In [239]: a2
Out[239]: 
array([[4, 8, 2, 4, 6, 3, 5],
       [8, 7, 1, 1, 5, 3, 2]])

Here, we are basically broadcasting a2 along the first axis of the output array, rather than explicitly making repeated copies.
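
As an illustrative aside, `np.broadcast_to` can be used to look at that stretched array directly. This is only a way of inspecting the broadcast; the assignment itself does not require calling it.

```python
import numpy as np

a2 = np.array([[4, 8, 2, 4, 6, 3, 5],
               [8, 7, 1, 1, 5, 3, 2]])
m1 = 3  # number of rows in the first array, as in the session above

# A read-only view of a2 stretched to shape (m1, m2, n2); conceptually the
# same stretching that happens during out[:, :, n1:] = a2.
view = np.broadcast_to(a2, (m1,) + a2.shape)

print(view.shape)       # (3, 2, 7)
print(view.strides[0])  # 0: the leading axis reuses the same memory
```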

Sample input and output, for completeness -

In [242]: a1
Out[242]: 
array([[5, 8],
       [6, 1],
       [4, 4]])

In [243]: a2
Out[243]: 
array([[4, 8, 2, 4, 6, 3, 5],
       [8, 7, 1, 1, 5, 3, 2]])

In [244]: out
Out[244]: 
array([[[5, 8, 4, 8, 2, 4, 6, 3, 5],
        [5, 8, 8, 7, 1, 1, 5, 3, 2]],

       [[6, 1, 4, 8, 2, 4, 6, 3, 5],
        [6, 1, 8, 7, 1, 1, 5, 3, 2]],

       [[4, 4, 4, 8, 2, 4, 6, 3, 5],
        [4, 4, 8, 7, 1, 1, 5, 3, 2]]])
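
The array shown above is still three-dimensional; the final `out.shape = (m1*m2,-1)` step flattens the first two axes. Here is a short sketch of how rows map after that step, using the same sample values. It is written with `reshape`, which gives the same result, and the loop is only a verification aid, not part of the method.

```python
import numpy as np

a1 = np.array([[5, 8], [6, 1], [4, 4]])
a2 = np.array([[4, 8, 2, 4, 6, 3, 5],
               [8, 7, 1, 1, 5, 3, 2]])
m1, n1 = a1.shape
m2, n2 = a2.shape

out = np.zeros((m1, m2, n1 + n2), dtype=int)
out[:, :, :n1] = a1[:, None, :]
out[:, :, n1:] = a2
out = out.reshape(m1 * m2, -1)  # flatten the first two axes

print(out.shape)  # (6, 9)
# Row i*m2 + j holds a1[i] followed by a2[j].
for i in range(m1):
    for j in range(m2):
        assert np.array_equal(out[i * m2 + j], np.concatenate((a1[i], a2[j])))
```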

Approach #2

Another one, with tiling/repeating -

parte1 = np.repeat(a1[:,None,:],m2,axis=0).reshape(-1,n1)  # n1 columns (not m2)
parte2 = np.repeat(a2[None],m1,axis=0).reshape(-1,n2)
out = np.c_[parte1, parte2]

Answer 1 (score: 0)

A solution with np.tile and np.repeat:

a1 = np.arange(20).reshape(5,4)
a2 = np.arange(6).reshape(3,2)

res = np.hstack((np.tile(a1,(len(a2),1)),np.repeat(a2,len(a1),0)))

# array([[ 0,  1,  2,  3,  0,  1],
#        [ 4,  5,  6,  7,  0,  1],
#        [ 8,  9, 10, 11,  0,  1],
#        [12, 13, 14, 15,  0,  1],
#        [16, 17, 18, 19,  0,  1],
#        [ 0,  1,  2,  3,  2,  3],
#        [ 4,  5,  6,  7,  2,  3],
#        [ 8,  9, 10, 11,  2,  3],
#        [12, 13, 14, 15,  2,  3],
#        [16, 17, 18, 19,  2,  3],
#        [ 0,  1,  2,  3,  4,  5],
#        [ 4,  5,  6,  7,  4,  5],
#        [ 8,  9, 10, 11,  4,  5],
#        [12, 13, 14, 15,  4,  5],
#        [16, 17, 18, 19,  4,  5]])
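
Note that the row order here differs from the broadcasting approach: `np.tile` cycles through `a1` while `np.repeat` holds each `a2` row fixed, so the `a2` index varies slowest. Here is a small sketch of that mapping, with `np.hstack` written out in full. The loop is only a check.

```python
import numpy as np

a1 = np.arange(20).reshape(5, 4)
a2 = np.arange(6).reshape(3, 2)

res = np.hstack((np.tile(a1, (len(a2), 1)), np.repeat(a2, len(a1), 0)))
print(res.shape)  # (15, 6)

# Row j*len(a1) + i holds a1[i] followed by a2[j].
for j in range(len(a2)):
    for i in range(len(a1)):
        expected = np.concatenate((a1[i], a2[j]))
        assert np.array_equal(res[j * len(a1) + i], expected)
```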

Answer 2 (score: 0)

meshgrid can be used, but indirectly, to generate row indices:

In [796]: A = np.arange(6).reshape(3,2)
In [797]: B = np.arange(12).reshape(4,3)*10    # reduced size

Meshed indices for the rows of the 2 arrays:

In [798]: idx=np.meshgrid(np.arange(3), np.arange(4),indexing='ij')
In [799]: idx
Out[799]: 
[array([[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2]]), 
 array([[0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3]])]

Indexing with these replicates the rows of A multiple times; the same applies to B:

In [800]: A[idx[0],:]
Out[800]: 
array([[[0, 1],
        [0, 1],
        [0, 1],
        [0, 1]],

       [[2, 3],
        [2, 3],
        [2, 3],
        [2, 3]],

       [[4, 5],
        [4, 5],
        [4, 5],
        [4, 5]]])

Now concatenate them along the last dimension, producing a (3,4,5) array. Finally, reshape to (12,5):

In [802]: np.concatenate((A[idx[0],:],B[idx[1],:]), axis=-1).reshape(12,5)
Out[802]: 
array([[  0,   1,   0,  10,  20],
       [  0,   1,  30,  40,  50],
       [  0,   1,  60,  70,  80],
       [  0,   1,  90, 100, 110],
       [  2,   3,   0,  10,  20],
       [  2,   3,  30,  40,  50],
       [  2,   3,  60,  70,  80],
       [  2,   3,  90, 100, 110],
       [  4,   5,   0,  10,  20],
       [  4,   5,  30,  40,  50],
       [  4,   5,  60,  70,  80],
       [  4,   5,  90, 100, 110]])
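
The same idea can be written for arbitrary row counts by flattening the index grids first, which avoids hard-coding the final shape. This is a sketch; the function name `combine_rows` is made up for illustration.

```python
import numpy as np

def combine_rows(A, B):
    """Pair every row of A with every row of B via meshgrid indices (hypothetical helper)."""
    i, j = np.meshgrid(np.arange(A.shape[0]), np.arange(B.shape[0]),
                       indexing='ij')
    # Flatten the index grids so each (i, j) pair selects one output row.
    return np.concatenate((A[i.ravel()], B[j.ravel()]), axis=1)

A = np.arange(6).reshape(3, 2)
B = np.arange(12).reshape(4, 3) * 10
res = combine_rows(A, B)
print(res.shape)  # (12, 5)
print(res[1])     # [ 0  1 30 40 50]
```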