Question

I'm trying to convert a 10x2 array into a record array by giving each column a name.

Here is what I tried:

>>> from numpy import arange, array, dstack, roll
>>> t = arange(10)
>>> n = dstack([t,
...             roll(t, 1),
...             roll(t, -1)])[0]
>>> n = n[:,1:3]
>>> n
array([[9, 1],
       [0, 2],
       [1, 3],
       [2, 4],
       [3, 5],
       [4, 6],
       [5, 7],
       [6, 8],
       [7, 9],
       [8, 0]])
>>> nt = [('left', int), ('right', int)]
>>> array(n, nt)
array([[(9, 9), (1, 1)],
       [(0, 0), (2, 2)],
       [(1, 1), (3, 3)],
       [(2, 2), (4, 4)],
       [(3, 3), (5, 5)],
       [(4, 4), (6, 6)],
       [(5, 5), (7, 7)],
       [(6, 6), (8, 8)],
       [(7, 7), (9, 9)],
       [(8, 8), (0, 0)]], 
      dtype=[('left', '<i8'), ('right', '<i8')])
>>>

To my surprise, every element of each row is a tuple rather than a plain int.

How can I fix this so that each row of n looks like [9, 1] rather than [(9, 9), (1, 1)]?

Answer 1

You can create a view with the new dtype; it looks at the same underlying data:

In [150]: nt = [('left', int), ('right', int)]  # np.int was removed in NumPy 1.24; the builtin int is equivalent

In [151]: n
Out[151]: 
array([[9, 1],
       [0, 2],
       [1, 3],
       [2, 4],
       [3, 5],
       [4, 6],
       [5, 7],
       [6, 8],
       [7, 9],
       [8, 0]])

In [152]: n.view(nt)
Out[152]: 
array([[(9, 1)],
       [(0, 2)],
       [(1, 3)],
       [(2, 4)],
       [(3, 5)],
       [(4, 6)],
       [(5, 7)],
       [(6, 8)],
       [(7, 9)],
       [(8, 0)]], 
      dtype=[('left', '<i8'), ('right', '<i8')])

The view keeps the 2d shape, here (10, 1); reshape it to get a 1d array of records:

In [160]: n_struct = n.view(nt)

In [161]: n_struct.shape
Out[161]: (10, 1)

In [162]: n_struct = n.view(nt).reshape(n.shape[0])

In [163]: n_struct
Out[163]: 
array([(9, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7), (6, 8),
       (7, 9), (8, 0)], 
      dtype=[('left', '<i8'), ('right', '<i8')])

Fields are then accessed by name, as you asked:

In [170]: n_struct['left']
Out[170]: array([9, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [171]: n_struct['right']
Out[171]: array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
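As a quick check that the structured view really shares memory with the original array, here is a minimal sketch. The small 3x2 array and the explicit np.int64 dtype are assumptions made to keep the example short and platform independent:

```python
import numpy as np

n = np.array([[9, 1], [0, 2], [1, 3]], dtype=np.int64)
nt = [('left', np.int64), ('right', np.int64)]

# View the same buffer as records, then drop the trailing length-1 axis.
n_struct = n.view(nt).reshape(n.shape[0])

# Writing through the structured view changes the original array too.
n_struct['left'][0] = 100
shared = np.shares_memory(n, n_struct)
print(n[0, 0], shared)
```

No data is copied here, so the write through n_struct shows up in n as well.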

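A caveat from @Ophion: this only works when the dtypes are compatible, because ndarray.view(dtype) reinterprets the raw data as the given dtype; it does not convert the data to the new dtype. In other words (from the documentation),

a.view(some_dtype) constructs a view of the array's memory with a different data-type. This can cause a reinterpretation of the bytes of memory.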
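To illustrate the caveat, the sketch below views int64 data as float64 fields, which reinterprets the bytes instead of converting them, and then shows one possible workaround: convert with astype first, then view. The float field dtype and the small sample array are assumptions chosen for illustration:

```python
import numpy as np

n = np.array([[9, 1], [0, 2], [1, 3]], dtype=np.int64)
float_fields = [('left', np.float64), ('right', np.float64)]

# Same item size, different type: the int64 bytes are read as float64.
reinterpreted = n.view(float_fields).reshape(-1)

# Convert the values first (this copies), then view the copy as records.
converted = n.astype(np.float64).view(float_fields).reshape(-1)

print(reinterpreted['left'][0] == 9.0)  # the bit pattern of 9 is not the float 9.0
print(converted['left'])
```

Note that the astype step makes a copy, so the converted record array no longer shares memory with n.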

Answer 2

Hopefully there is a better way in pure numpy, but to get you started:

>>> import numpy as np
>>> nt = [('left', int), ('right', int)]
>>> n
array([[9, 1],
       [0, 2],
       [1, 3],
       [2, 4],
       [3, 5],
       [4, 6],
       [5, 7],
       [6, 8],
       [7, 9],
       [8, 0]])

>>> out = np.zeros(n.shape[0], dtype=nt)
>>> out
array([(0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0),
       (0, 0), (0, 0)],
      dtype=[('left', '<i8'), ('right', '<i8')])

>>> out['left']=n[:,0]
>>> out['right']=n[:,1]

>>> out
array([(9, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7), (6, 8),
       (7, 9), (8, 0)],
      dtype=[('left', '<i8'), ('right', '<i8')])

>>> out['left']
array([9, 0, 1, 2, 3, 4, 5, 6, 7, 8])
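For what it's worth, NumPy 1.16 and later ship a helper for this conversion, numpy.lib.recfunctions.unstructured_to_structured, which turns the last axis of a plain array into fields. A minimal sketch, assuming such a NumPy version and using a shortened 3x2 sample array:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

n = np.array([[9, 1], [0, 2], [1, 3]])
nt = np.dtype([('left', np.int64), ('right', np.int64)])

# Each row of n becomes one record; columns map to fields in order.
out = rfn.unstructured_to_structured(n, dtype=nt)

print(out['left'])
print(out.shape)
```

Unlike view, this casts the values to the field types, so it does not depend on the source and target dtypes having the same byte layout.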

And of course there is the pandas answer:

>>> import pandas as pd
>>> df = pd.DataFrame(n,columns=['left','right'])
>>> df
   left  right
0     9      1
1     0      2
2     1      3
3     2      4
4     3      5
5     4      6
6     5      7
7     6      8
8     7      9
9     8      0

The nice thing about a pandas DataFrame is that you can still get the plain array back:

>>> df.values
array([[9, 1],
       [0, 2],
       [1, 3],
       [2, 4],
       [3, 5],
       [4, 6],
       [5, 7],
       [6, 8],
       [7, 9],
       [8, 0]])
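If the goal is still a NumPy record array, the DataFrame can be converted with DataFrame.to_records. A short sketch, assuming pandas is installed and using a shortened sample array:

```python
import numpy as np
import pandas as pd

n = np.array([[9, 1], [0, 2], [1, 3]])
df = pd.DataFrame(n, columns=['left', 'right'])

# index=False leaves the DataFrame index out of the resulting records.
rec = df.to_records(index=False)

print(rec.dtype.names)
print(rec['left'])
```

Without index=False, the result would carry an extra 'index' field in front of the two columns.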

Answer 3

If the underlying dtypes are not compatible, the view approach does not work. The fallback is to fill the record array from a list of tuples:

In [128]: x=np.arange(12).reshape(4,3)

In [129]: y=np.zeros((4,),dtype=[('x','f'),('y','f'),('z','f')])

In [130]: y
Out[130]: 
array([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)], 
      dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

In [131]: y[:]=[tuple(row) for row in x]

In [132]: y
Out[132]: 
array([(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0), (9.0, 10.0, 11.0)], 
      dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

The same list of tuples can also be used to construct the array directly:

In [135]: np.array([tuple(row) for row in x],y.dtype)
Out[135]: 
array([(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0), (9.0, 10.0, 11.0)], 
      dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
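Another option that avoids the per-row tuple conversion is np.rec.fromarrays, which takes one array per field and returns a record array. This is a sketch of an alternative to the approach above, not part of it; the variable names are only for illustration:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
dt = [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]

# Transpose so that each column of x becomes one input array (one per field).
rec = np.rec.fromarrays(list(x.T), dtype=dt)

print(rec.x)     # record arrays also allow attribute access to fields
print(rec['z'])
```

The integer columns are cast to float32 when they are assigned to the fields, so the source and target dtypes do not need to match.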

Converting a numpy array to a numpy record array
