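Appending a smaller numpy.ndarray to a larger numpy.ndarray

Time: 2017-04-06 15:19:55

Tags: python arrays numpy

Edit: I think the actual problem here is that I am appending along one dimension of the arrays in the list, which makes the shapes inconsistent. The length of the zeroth axis is then not the same for every entry.

So what I really want to do is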

>>> import numpy as np
>>> a = np.asarray([np.empty((0,3,3)) for i in range(4)])
>>> print(a.shape)
(4, 0, 3, 3)
>>> a = np.append(a,np.random.randint(4,size=(4,1,3,3)))
>>> print(a.shape)
(36,)

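I have to append smaller numpy.ndarrays, but I would like to avoid reshaping the result to (4,x,3,3), where x is an incrementing value.

Why am I interested in this?

I tried doing the same thing with a list of np.empty arrays, but that consumes too much memory. So I need a solution that uses less memory, and from other posts it looks like that may be possible with numpy arrays.

3 Answers:

Answer 0: (score: 1)

I think this is what you are after.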

>>> compl = np.asarray([np.empty((1,3,3)) for i in range (4)])
>>> print(compl.shape)
(4, 1, 3, 3)
>>> compl = np.append(compl,np.random.randint(5,size=(1,3,3)))

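Here you need to append to the whole array compl rather than to its first element compl[0]. However, np.append without an axis flattens its inputs, so the shape changes.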

>>> print(compl.shape)
(45,)
>>> print(compl.reshape((5, 1, 3, 3)))

Use reshape to get the shape back.
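As a minimal sketch of why the reshape step recovers the intended array (the names `extra`, `flat_then_reshape`, and `direct` are illustrative, and `np.zeros` stands in for `np.empty` so the values are predictable): flattening, appending, and reshaping gives the same result as concatenating along the first axis. This only holds for growth along axis 0, since the appended values land at the end of the flattened data.

```python
import numpy as np

compl = np.zeros((4, 1, 3, 3))                 # stand-in for the (4, 1, 3, 3) array above
extra = np.random.randint(5, size=(1, 3, 3))   # the block being appended

# Flatten-and-append, then restore the 4d shape by hand.
flat_then_reshape = np.append(compl, extra).reshape((5, 1, 3, 3))

# The same result, built directly by concatenating along axis 0.
direct = np.concatenate([compl, extra[np.newaxis]], axis=0)

print(flat_then_reshape.shape)                    # (5, 1, 3, 3)
print(np.array_equal(flat_then_reshape, direct))  # True
```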

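Edit

Since you want to change the second dimension, use axis=1

>>> a = np.append(a,np.random.randint(4,size=(4,1,3,3)), axis=1)

Because of the axis argument, the array keeps its dimensions without any reshape.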

>>> a.shape  # assumes a was (4, 1, 3, 3) before the append; starting from (4, 0, 3, 3) gives (4, 1, 3, 3)
(4, 2, 3, 3)

Answer 1: (score: 0)

You are creating empty (size 0) arrays.

>>> print(np.asarray([np.empty((0,3,3)) for i in range(4)]))
[]

Also, your error is raised by the assignment to the array slice, not by the concatenation.

>>> tmp = np.append(compl[0],np.random.randint(5,size=(1,3,3)))
>>> print(tmp) # this is perfectly valid
[  1.48219694e-323   1.48219694e-323   1.48219694e-323   4.94065646e-324
   0.00000000e+000   9.88131292e-324   1.48219694e-323   1.48219694e-323
   4.94065646e-324   2.00000000e+000   4.00000000e+000   0.00000000e+000
   1.00000000e+000   4.00000000e+000   2.00000000e+000   4.00000000e+000
   3.00000000e+000   4.00000000e+000]
>>> compl[0] = tmp
ValueError                                Traceback (most recent call last)
<ipython-input-20-63e5075d550e> in <module>()
----> 1 compl[0] = tmp
ValueError: could not broadcast input array from shape (18) into shape (1,3,3)

I don't quite understand why you would want to do this, but I think what you want is the following.

>>> compl = np.append(compl[0],np.random.randint(5,size=(1,3,3)), axis=0)
>>> print(compl.shape)
(2, 3, 3)

Answer 2: (score: 0)

A couple of array basics

  • array size is fixed when created; changing size requires making a new array
  • arrays are 'rectangular'; no dimension can be ragged.

compl can be made without the list comprehension:

In [150]: compl=np.zeros((4,0,3,3))
In [151]: compl
Out[151]: array([], shape=(4, 0, 3, 3), dtype=float64)

Because of the size 0 dimension, there are no values in this array. It has the 4d shape, but 0 elements.

Your append statement produces a 1d array:

In [152]: np.append(compl[0],np.random.randint(5,size=(1,3,3)))
Out[152]: array([ 2.,  0.,  2.,  2.,  4.,  1.,  0.,  1.,  2.])
In [153]: _.shape
Out[153]: (9,)

Why 1d when the inputs are 3d? Read the docs: If axis is None, out is a flattened array.
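To make that rule concrete, here is a small sketch with illustrative names (`base`, `block`, `flat`, `kept` are not from the question). It contrasts np.append with the default axis against the same call with axis=0:

```python
import numpy as np

base = np.zeros((0, 3, 3))   # 3d array with no elements, like compl[0]
block = np.ones((1, 3, 3))   # one 3x3 block to add

# Default axis=None: both inputs are flattened before being joined.
flat = np.append(base, block)

# Explicit axis: the inputs are joined along that axis and stay 3d.
kept = np.append(base, block, axis=0)

print(flat.shape)   # (9,)
print(kept.shape)   # (1, 3, 3)
```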

You'd get more control using np.concatenate:

In [155]: compl[0].shape
Out[155]: (0, 3, 3)
In [156]: np.concatenate([compl[0],np.random.randint(5,size=(1,3,3))])
Out[156]: 
array([[[ 4.,  4.,  0.],
        [ 0.,  2.,  3.],
        [ 3.,  0.,  2.]]])
In [157]: _.shape
Out[157]: (1, 3, 3)

This is just another reason why I discourage the use of np.append. It behaves as expected when adding one value to a 1d array. Almost all other uses give problems:

In [159]: np.append(np.arange(4),100)
Out[159]: array([  0,   1,   2,   3, 100])

I was going to say that even with the correct concatenate, assignment to compl[0] would produce an error. Actually it does run, but the result is unexpected:

In [161]: compl[0]=np.concatenate([compl[0],np.random.randint(5,size=(1,3,3))])
In [162]: compl.shape     # unchanged, and not ragged
Out[162]: (4, 0, 3, 3)
In [163]: compl[0].shape
Out[163]: (0, 3, 3)

We tell it to assign a (1,3,3) array to a (0,3,3) slot (in the (4,0,3,3) array). Because of the broadcasting rules, that works. Size 1 dimensions are flexible, and can be replicated to match the target. Here the target is size 0.

2nd lesson - size 0 dimensions can produce unexpected (though logically correct) results.
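A short sketch of that size 0 behavior, using made-up names (`target`, `raised`): a value with a size 1 axis is accepted and silently discarded, while a value with a size 2 axis cannot be broadcast to length 0 and raises.

```python
import numpy as np

target = np.zeros((4, 0, 3, 3))

# A size-1 axis broadcasts to length 0, so this assignment copies nothing.
target[0] = np.ones((1, 3, 3))
print(target.shape, target.size)   # (4, 0, 3, 3) 0

# A size-2 axis cannot be broadcast to length 0, so NumPy raises ValueError.
try:
    target[0] = np.ones((2, 3, 3))
    raised = False
except ValueError:
    raised = True
print(raised)   # True
```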


Your edit is still running into that flatten behavior in append. The correct way to use append (though I still consider it wrong):

In [170]: compl=np.zeros((4,0,3,3), int)
In [171]: np.append(compl, np.random.randint(4, size=(4,1,3,3)), axis=1)
Out[171]: 
array([[[[2, 0, 2],
         [0, 1, 0],
         [0, 1, 2]]],

       ....

       [[[1, 3, 3],
         [2, 3, 3],
         [0, 3, 0]]]])
In [172]: _.shape
Out[172]: (4, 1, 3, 3)
In [173]: compl=np.append(compl, np.random.randint(4, size=(4,1,3,3)), axis=1)
In [174]: compl=np.append(compl, np.random.randint(4, size=(4,1,3,3)), axis=1)
In [175]: compl.shape
Out[175]: (4, 2, 3, 3)

A better way to construct such an array is to collect subarrays in a list, and assemble them into one array at the end:

Using a list comprehension (or regular loop with list append):

In [176]: alist = [np.random.randint(4,size=(4,3,3)) for _ in range(5)]
In [177]: len(alist)
Out[177]: 5

np.array assembles them with the new dimension at the start:

In [178]: np.array(alist).shape
Out[178]: (5, 4, 3, 3)

np.stack gives more control over the axis:

In [179]: np.stack(alist, axis=1).shape
Out[179]: (4, 5, 3, 3)

Or build the subarrays with the extra dimension, and use concatenate:

In [180]: alist = [np.random.randint(4,size=(4,1,3,3)) for _ in range(5)]
In [181]: np.concatenate(alist, axis=1).shape
Out[181]: (4, 5, 3, 3)

concatenate (and relatives) all take a list of arrays, so they don't need to be (and shouldn't) applied iteratively. List append is faster.
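Putting the advice together, a minimal sketch under stated assumptions: `n_steps` is a hypothetical count of blocks, and the second option only applies when that count is known before the loop starts. Preallocation is not something the answer above proposes; it is shown here only as a possible alternative for the memory concern raised in the question.

```python
import numpy as np

n_steps = 5  # hypothetical number of blocks to add

# Option 1: collect blocks in a list and concatenate once at the end.
blocks = []
for _ in range(n_steps):
    blocks.append(np.random.randint(4, size=(4, 1, 3, 3)))
joined = np.concatenate(blocks, axis=1)

# Option 2 (assumes n_steps is known up front): preallocate, then fill slices.
filled = np.empty((4, n_steps, 3, 3), dtype=joined.dtype)
for i, block in enumerate(blocks):
    filled[:, i:i + 1] = block   # each (4, 1, 3, 3) block fills one slot on axis 1

print(joined.shape)                     # (4, 5, 3, 3)
print(np.array_equal(joined, filled))   # True
```

Both options allocate the final array once, instead of making a new, larger copy on every step as repeated np.append calls do.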