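If I want to combine vectors of different dtypes into a single two-dimensional numpy array, I can use either

1. a structured array, or
2. a record array.

When should I use 1., and when should I use 2.? Do they behave the same in terms of performance and convenience?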
The record array takes less code to create, but does the structured array have other advantages that make it preferable to the record array?

Code example:
import numpy as np
a = np.array([['2018-04-01T15:30:00'],
['2018-04-01T15:31:00'],
['2018-04-01T15:32:00'],
['2018-04-01T15:33:00'],
['2018-04-01T15:34:00']], dtype='datetime64[s]')
c = np.array([0,1,2,3,4]).reshape(-1,1)
Structured array (see: How to insert column of different type to numpy array?):
# create the compound dtype
dtype = np.dtype(dict(names=['date', 'val'], formats=[arr.dtype for arr in (a, c)]))
# create an empty structured array
struct = np.empty(a.shape[0], dtype=dtype)
# populate the structured array with the data from your column arrays
struct['date'], struct['val'] = a.T, c.T
print(struct)
# output:
# array([('2018-04-01T15:30:00', 0), ('2018-04-01T15:31:00', 1),
# ('2018-04-01T15:32:00', 2), ('2018-04-01T15:33:00', 3),
# ('2018-04-01T15:34:00', 4)],
# dtype=[('date', '<M8[s]'), ('val', '<i8')])
Record array (see: How can I avoid that np.datetime64 gets auto converted to datetime when adding it to a numpy array?):
rarr = np.rec.fromarrays([a, c], names=('date', 'val'))
print(rarr)
# output
# rec.array([[('2018-04-01T15:30:00', 0)],
# [('2018-04-01T15:31:00', 1)],
# [('2018-04-01T15:32:00', 2)],
# [('2018-04-01T15:33:00', 3)],
# [('2018-04-01T15:34:00', 4)]],
# dtype=[('date', '<M8[s]'), ('val', '<i8')])
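
For comparison, here is a minimal sketch of how the two types relate, as far as I understand it: a structured array can be viewed as a record array without copying, and the record array then also allows attribute-style field access. The sample data and the names `dates`, `vals`, `rec`, `same_values` and `shares_memory` are assumptions made for illustration (1-D versions of the column arrays above), not part of the original example.

```python
import numpy as np

# Hypothetical sample data: 1-D versions of the column arrays used above.
dates = np.array(['2018-04-01T15:30:00', '2018-04-01T15:31:00'],
                 dtype='datetime64[s]')
vals = np.array([0, 1])

# Structured array: fields are accessed by name through indexing.
dtype = np.dtype([('date', dates.dtype), ('val', vals.dtype)])
struct = np.empty(dates.shape[0], dtype=dtype)
struct['date'] = dates
struct['val'] = vals

# Viewing the same buffer as a record array adds attribute access
# (rec.val) on top of the usual indexing (rec['val']).
rec = struct.view(np.recarray)

same_values = np.array_equal(rec.val, struct['val'])
shares_memory = np.shares_memory(rec, struct)
print(same_values, shares_memory)
```

If this holds, the choice looks mostly like one of convenience: `rec.val` versus `struct['val']` on the same underlying data. Whether the attribute lookup costs anything noticeable is something I have not measured.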