Slicing a numpy array to a desired shape

Time: 2016-09-21 16:37:46

Tags: python arrays numpy

Surprisingly, I couldn't find an answer to this anywhere on the internet. I have an n-dimensional numpy array. For example, a 2-D np array:

array([['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
       ['41.7900000', '44.8000000', '48.2600000', '46.1800000'],
       ['36.1200000', '37.1500000', '39.3100000', '38.1000000'],
       ['82.1000000', '82.0900000', '76.0200000', '77.7000000'],
       ['48.0100000', '51.2500000', '51.1700000', '52.5000000', '55.2500000'],
       ['39.7500000', '39.5000000', '36.8100000', '37.2500000']], dtype=object)

As you can see, row 5 contains 5 elements, and I want to make the fifth one disappear, using something like:

np.slice(MyArray, [6,4]) 

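where [6,4] is the target shape. I really don't want to iterate through the dimensions and cut them. I tried the resize method, but it returns nothing!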

2 Answers:

Answer 0 (score: 1)

This is not a 2d array. It is a 1d array whose elements are objects, in this case several 4-element lists and one 5-element list. These lists contain strings.

In [577]: np.array([['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
     ...:        ['41.7900000', '44.8000000', '48.2600000', '46.1800000'],
     ...:        ['36.1200000', '37.1500000', '39.3100000', '38.1000000'],
     ...:        ['82.1000000', '82.0900000', '76.0200000', '77.7000000'],
     ...:        ['48.0100000', '51.2500000', '51.1700000', '52.5000000', '55.2500000'],
     ...:        ['39.7500000', '39.5000000', '36.8100000', '37.2500000']], dtype=object)
Out[577]: 
array([['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
       ['41.7900000', '44.8000000', '48.2600000', '46.1800000'],
       ['36.1200000', '37.1500000', '39.3100000', '38.1000000'],
       ['82.1000000', '82.0900000', '76.0200000', '77.7000000'],
       ['48.0100000', '51.2500000', '51.1700000', '52.5000000', '55.2500000'],
       ['39.7500000', '39.5000000', '36.8100000', '37.2500000']], dtype=object)
In [578]: MyArray=_
In [579]: MyArray.shape
Out[579]: (6,)
In [580]: MyArray[0]
Out[580]: ['34.5500000', '36.9000000', '37.3200000', '37.6700000']
In [581]: MyArray[5]
Out[581]: ['39.7500000', '39.5000000', '36.8100000', '37.2500000']
In [582]: MyArray[4]
Out[582]: ['48.0100000', '51.2500000', '51.1700000', '52.5000000', '55.2500000']

To slice it, you need to iterate over the elements of the array:

In [584]: [d[:4] for d in MyArray]
Out[584]: 
[['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
 ['41.7900000', '44.8000000', '48.2600000', '46.1800000'],
 ['36.1200000', '37.1500000', '39.3100000', '38.1000000'],
 ['82.1000000', '82.0900000', '76.0200000', '77.7000000'],
 ['48.0100000', '51.2500000', '51.1700000', '52.5000000'],
 ['39.7500000', '39.5000000', '36.8100000', '37.2500000']]

Now that all the sublists have the same length, np.array will create a 2d array:

In [585]: np.array(_)
Out[585]: 
array([['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
       ['41.7900000', '44.8000000', '48.2600000', '46.1800000'],
       ['36.1200000', '37.1500000', '39.3100000', '38.1000000'],
       ['82.1000000', '82.0900000', '76.0200000', '77.7000000'],
       ['48.0100000', '51.2500000', '51.1700000', '52.5000000'],
       ['39.7500000', '39.5000000', '36.8100000', '37.2500000']], 
      dtype='<U10')

The elements are still strings, but passing dtype=float converts them:

In [586]: np.array(__,dtype=float)
Out[586]: 
array([[ 34.55,  36.9 ,  37.32,  37.67],
       [ 41.79,  44.8 ,  48.26,  46.18],
       [ 36.12,  37.15,  39.31,  38.1 ],
       [ 82.1 ,  82.09,  76.02,  77.7 ],
       [ 48.01,  51.25,  51.17,  52.5 ],
       [ 39.75,  39.5 ,  36.81,  37.25]])
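
The steps above (truncate each row, rebuild the array, convert to float) can be folded into one helper. This is only a minimal sketch, assuming every row has at least the target number of items; truncate_rows is a hypothetical name chosen for illustration, not a NumPy function.

```python
import numpy as np

def truncate_rows(rows, width, dtype=float):
    """Cut every row to `width` items and build a regular 2-D array.

    Hypothetical helper for illustration only.
    """
    trimmed = [list(row)[:width] for row in rows]
    # A row shorter than `width` would still leave the data ragged,
    # so fail loudly instead of returning another object array.
    if any(len(row) != width for row in trimmed):
        raise ValueError("every row needs at least `width` items")
    return np.array(trimmed, dtype=dtype)

# A shortened version of the data from the question.
ragged = [
    ['34.5500000', '36.9000000', '37.3200000', '37.6700000'],
    ['48.0100000', '51.2500000', '51.1700000', '52.5000000', '55.2500000'],
    ['39.7500000', '39.5000000', '36.8100000', '37.2500000'],
]
result = truncate_rows(ragged, 4)
print(result.shape)  # (3, 4)
```

The loop over rows is still there, just hidden in the list comprehension; with ragged data stored as Python lists there is no way around touching each row once.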

Answer 1 (score: 0)

Here's an almost* vectorized approach:

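import numpy as np

def slice_2Dobject_arr(arr, out_shape):
    # Length of each row; list() is needed on Python 3, where map is lazy
    lens = np.array(list(map(len, arr)))
    # Ones everywhere, with a negative jump at each row start so that the
    # running sum restarts at 1 and gives the position within the row
    id_arr = np.ones(lens.sum(), dtype=int)
    id_arr[lens[:-1].cumsum()] = -lens[:-1] + 1
    # Keep only the first out_shape[1] items of every row
    mask = id_arr.cumsum() <= out_shape[1]
    vals = np.concatenate(arr)
    return vals[mask].reshape(-1, out_shape[1])[:out_shape[0]]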

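*: Almost, because map is used at the start to get the lengths of the lists in the input array, which doesn't appear to be a vectorized operation. Computationally, though, that should be comparatively negligible.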
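
To see how the running-sum trick selects the first few items of each row, the intermediate arrays can be printed for a small made-up input. This is only an illustrative trace; the names rows, width, starts and position are assumptions introduced here and are not part of the function above.

```python
import numpy as np

# Made-up ragged rows; a plain list of lists is enough for the trace.
rows = [[3, 4, 5, 3], [3, 7, 8], [4, 9, 6, 4, 2], [3, 9, 4]]
width = 3  # number of items to keep per row

lens = np.array([len(r) for r in rows])   # row lengths: [4 3 5 3]
id_arr = np.ones(lens.sum(), dtype=int)   # one entry per flattened item
starts = lens[:-1].cumsum()               # flat index where rows 1.. begin
id_arr[starts] = 1 - lens[:-1]            # jump that resets the count to 1

position = id_arr.cumsum()                # 1-based position within each row
print(position)  # [1 2 3 4 1 2 3 1 2 3 4 5 1 2 3]

mask = position <= width                  # True for the first `width` items
out = np.concatenate(rows)[mask].reshape(-1, width)
print(out)
```

Note that the final reshape only lines up if every row has at least width items; a shorter row would contribute fewer values and shift everything after it.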

Sample run:

In [92]: arr
Out[92]: array([[3, 4, 5, 3], [3, 7, 8], [4, 9, 6, 4, 2], [3, 9, 4]], dtype=object)

In [93]: slice_2Dobject_arr(arr,(4,3))
Out[93]: 
array([[3, 4, 5],
       [3, 7, 8],
       [4, 9, 6],
       [3, 9, 4]])

In [94]: slice_2Dobject_arr(arr,(3,3))
Out[94]: 
array([[3, 4, 5],
       [3, 7, 8],
       [4, 9, 6]])

In [95]: slice_2Dobject_arr(arr,(3,2))
Out[95]: 
array([[3, 4],
       [3, 7],
       [4, 9]])