Question

如果要将不同dtypes的向量组合到二维numpy数组中，可以使用

numpy structured array或
numpy record array。

何时应使用1.，何时应使用2.？它们在性能，便利性方面是否表现相同？

record array是用更少的代码创建的，但是structured array是否具有其他一些优点，使其比其他代码更受欢迎？

代码示例：

import numpy as np
a = np.array([['2018-04-01T15:30:00'],
       ['2018-04-01T15:31:00'],
       ['2018-04-01T15:32:00'],
       ['2018-04-01T15:33:00'],
       ['2018-04-01T15:34:00']], dtype='datetime64[s]')
c = np.array([0,1,2,3,4]).reshape(-1,1)

structured array：

（请参阅：How to insert column of different type to numpy array?）

# create the compound dtype
dtype = np.dtype(dict(names=['date', 'val'], formats=[arr.dtype for arr in (a, c)]))

# create an empty structured array
struct = np.empty(a.shape[0], dtype=dtype)

# populate the structured array with the data from your column arrays
struct['date'], struct['val'] = a.T, c.T

print(struct)
# output:
#     array([('2018-04-01T15:30:00', 0), ('2018-04-01T15:31:00', 1),
#            ('2018-04-01T15:32:00', 2), ('2018-04-01T15:33:00', 3),
#            ('2018-04-01T15:34:00', 4)],
#           dtype=[('date', '<M8[s]'), ('val', '<i8')])

record array：

（请参阅How can I avoid that np.datetime64 gets auto converted to datetime when adding it to a numpy array?）

rarr = np.rec.fromarrays([a, c], names=('date', 'val'))

print(rarr)
# output
#     rec.array([[('2018-04-01T15:30:00', 0)],
#                [('2018-04-01T15:31:00', 1)],
#                [('2018-04-01T15:32:00', 2)],
#                [('2018-04-01T15:33:00', 3)],
#                [('2018-04-01T15:34:00', 4)]],
#               dtype=[('date', '<M8[s]'), ('val', '<i8')])

何时使用numpy结构或numpy记录数组？

0 个答案: