Structured to unstructured NumPy array conversion fails in 1.16.0

Date: 2019-07-24 13:21:06

Tags: python numpy structured-array

I want to convert a NumPy structured array whose selected columns all share the same type (np.float) into an unstructured array in NumPy 1.16.0.

Previously I did it like this:

import numpy as np

array = np.ones((100,), dtype=[('user', np.object), ('item', np.float), ('value', np.float)])
array[['item','value']].view((np.float, 2))

In 1.16.0, the structured_to_unstructured function was added to numpy.lib.recfunctions.

However, for a view taken from an array that has an object column, both the new structured_to_unstructured and the old view approach raise TypeError: Cannot change data-type for object array.

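For a view of a structured array that has no object columns at all, it works fine, but it fails when the view contains only numeric columns selected from an array that also has an object field.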

1 answer:

Answer 0 (score: 0)

The handling of multi-field views changed significantly in version 1.16. You need to use rf.repack_fields to get the earlier behavior.

In [277]: import numpy.lib.recfunctions as rf 

In [287]: arr = np.ones(3, dtype='O,f,f')                                                                    
In [288]: arr                                                                                                
Out[288]: 
array([(1, 1., 1.), (1, 1., 1.), (1, 1., 1.)],
      dtype=[('f0', 'O'), ('f1', '<f4'), ('f2', '<f4')])
In [289]: rf.structured_to_unstructured(arr[['f1','f2']])                                                    
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-289-8700aa9aacb4> in <module>
----> 1 rf.structured_to_unstructured(arr[['f1','f2']])

/usr/local/lib/python3.6/dist-packages/numpy/lib/recfunctions.py in structured_to_unstructured(arr, dtype, copy, casting)
    969     with suppress_warnings() as sup:  # until 1.16 (gh-12447)
    970         sup.filter(FutureWarning, "Numpy has detected")
--> 971         arr = arr.view(flattened_fields)
    972 
    973     # next cast to a packed format with all fields converted to new dtype

/usr/local/lib/python3.6/dist-packages/numpy/core/_internal.py in _view_is_safe(oldtype, newtype)
    492 
    493     if newtype.hasobject or oldtype.hasobject:
--> 494         raise TypeError("Cannot change data-type for object array.")
    495     return
    496 

TypeError: Cannot change data-type for object array.

Repack before converting:

In [290]: rf.structured_to_unstructured(rf.repack_fields(arr[['f1','f2']]))                                  
Out[290]: 
array([[1., 1.],
       [1., 1.],
       [1., 1.]], dtype=float32)
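
As a script rather than an IPython session, the repack-then-convert step might look like the sketch below. The helper name numeric_columns_to_2d is hypothetical (not part of NumPy), and the field names simply mirror the question. The direct call is wrapped in try/except because it raised TypeError on 1.16 in the session above; behavior on other versions is not assumed here.

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Structured array with one object column and two float columns,
# mirroring the layout in the question.
records = np.ones(5, dtype=[('user', object), ('item', 'f8'), ('value', 'f8')])

def numeric_columns_to_2d(structured, names):
    """Hypothetical helper: return the named fields as a plain 2-D array."""
    # repack_fields copies the selected fields into a packed dtype,
    # so the object field is no longer part of the memory layout.
    packed = rf.repack_fields(structured[names])
    return rf.structured_to_unstructured(packed)

# Converting the multi-field view directly raised TypeError on 1.16.
try:
    rf.structured_to_unstructured(records[['item', 'value']])
except TypeError as exc:
    print("direct conversion failed:", exc)

matrix = numeric_columns_to_2d(records, ['item', 'value'])
print(matrix.shape, matrix.dtype)
```

The result is a copy, so writing to matrix does not modify records.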

A multi-field view preserves the underlying data layout. Note the offsets in this display. The object field is still there; it just is not shown.

In [291]: arr[['f1','f2']]                                                                                   
Out[291]: 
array([(1., 1.), (1., 1.), (1., 1.)],
      dtype={'names':['f1','f2'], 'formats':['<f4','<f4'], 'offsets':[8,12], 'itemsize':16})

The copy made by repack_fields does not include the object field:

In [292]: rf.repack_fields(arr[['f1','f2']])                                                                 
Out[292]: array([(1., 1.), (1., 1.), (1., 1.)], dtype=[('f1', '<f4'), ('f2', '<f4')])
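
The layout difference can also be checked programmatically through dtype.itemsize and dtype.fields. The sketch below uses an all-float array (an assumption made so the byte counts do not depend on the platform's pointer size) and assumes NumPy 1.16 or later, where multi-field indexing returns a view.

```python
import numpy as np
import numpy.lib.recfunctions as rf

arr = np.ones(3, dtype='f4,f4,f4')   # fields f0, f1, f2; 12 bytes per record
view = arr[['f1', 'f2']]             # multi-field view onto arr's memory
packed = rf.repack_fields(view)      # packed copy holding only f1 and f2

# dtype.fields maps each name to (dtype, byte offset).
print(view.dtype.itemsize, view.dtype.fields['f1'][1])      # record size and offset of f1 in the view
print(packed.dtype.itemsize, packed.dtype.fields['f1'][1])  # same for the packed copy

# The view shares memory with the original array; the packed copy does not.
print(np.shares_memory(arr, view), np.shares_memory(arr, packed))
```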

Even when all fields are floats, the view method runs into problems:

In [301]: arr = np.ones(3, dtype='f,f,f')                                                                    
In [302]: arr[['f1','f2']].view(('f',2))                                                                     
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-302-68433a44bcfe> in <module>
----> 1 arr[['f1','f2']].view(('f',2))

ValueError: Changing the dtype to a subarray type is only supported if the total itemsize is unchanged
In [303]: arr[['f1','f2']]                                                                                   
Out[303]: 
array([(1., 1.), (1., 1.), (1., 1.)],
      dtype={'names':['f1','f2'], 'formats':['<f4','<f4'], 'offsets':[4,8], 'itemsize':12})
In [304]: rf.repack_fields(arr[['f1','f2']]).view(('f',2))                                                   
Out[304]: 
array([[1., 1.],
       [1., 1.],
       [1., 1.]], dtype=float32)
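
The ValueError comes down to an itemsize mismatch: the multi-field view keeps the 12-byte record of the original array, while ('f', 2) describes 8 bytes. A minimal sketch of the same check outside IPython, assuming the same 'f4,f4,f4' layout as above:

```python
import numpy as np
import numpy.lib.recfunctions as rf

arr = np.ones(3, dtype='f4,f4,f4')
subset = arr[['f1', 'f2']]                 # still 12 bytes per record
target = np.dtype(('f4', 2))               # subarray dtype, 8 bytes

print(subset.dtype.itemsize, target.itemsize)

try:
    subset.view(target)                    # itemsizes differ, so this fails
except ValueError as exc:
    print("view failed:", exc)

packed = rf.repack_fields(subset)          # 8 bytes per record after repacking
flat = packed.view(target)                 # itemsizes now match
print(flat.shape, flat.dtype)
```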