How can I prevent np.datetime64 from being automatically converted to datetime when adding it to a numpy array?

Time: 2019-01-20 18:27:07

Tags: python numpy

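In the example below, elements with dtype np.datetime64 are automatically converted to datetime.datetime when they are appended to another numpy array.

How can I avoid this automatic conversion?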

import numpy as np
a = np.array([['2018-04-01T15:30:00'],
       ['2018-04-01T15:31:00'],
       ['2018-04-01T15:32:00'],
       ['2018-04-01T15:33:00'],
       ['2018-04-01T15:34:00']], dtype='datetime64[s]')
c = np.array([0,1,2,3,4]).reshape(-1,1)
c = c.astype("object")
d = np.append(c,a,axis=1)
d

array([[0, datetime.datetime(2018, 4, 1, 15, 30)],
       [1, datetime.datetime(2018, 4, 1, 15, 31)],
       [2, datetime.datetime(2018, 4, 1, 15, 32)],
       [3, datetime.datetime(2018, 4, 1, 15, 33)],
       [4, datetime.datetime(2018, 4, 1, 15, 34)]], dtype=object)

2 answers:

Answer 0 (score: 3)

Sometimes we have to make a "blank" object array and fill it element by element.

In [57]: d = np.empty((5,2), object)
In [58]: d
Out[58]: 
array([[None, None],
       [None, None],
       [None, None],
       [None, None],
       [None, None]], dtype=object)

I can fill it column by column, but the result is the same as with concatenate (don't use np.append):

In [59]: d[:,0] = c.ravel()
In [60]: d[:,1] = a.ravel()
In [61]: d
Out[61]: 
array([[0, datetime.datetime(2018, 4, 1, 15, 30)],
       [1, datetime.datetime(2018, 4, 1, 15, 31)],
       [2, datetime.datetime(2018, 4, 1, 15, 32)],
       [3, datetime.datetime(2018, 4, 1, 15, 33)],
       [4, datetime.datetime(2018, 4, 1, 15, 34)]], dtype=object)

As with a.astype(object), the dates have been "unboxed": each datetime64 value was converted to a datetime.datetime object.
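The conversion can be checked directly. The following is a minimal sketch with a shortened two-row version of the question's arrays (the names `as_obj` and `joined` are illustrative only); it shows that casting a datetime64[s] array to object, and concatenating it with an object array, both produce datetime.datetime instances.

```python
import datetime
import numpy as np

# Shortened versions of the arrays from the question.
a = np.array(['2018-04-01T15:30:00', '2018-04-01T15:31:00'],
             dtype='datetime64[s]').reshape(-1, 1)
c = np.arange(2).reshape(-1, 1).astype(object)

# Casting datetime64[s] to object yields datetime.datetime instances.
as_obj = a.astype(object)
print(type(as_obj[0, 0]))

# Concatenating with an object array applies the same cast to `a`.
joined = np.concatenate((c, a), axis=1)
print(joined.dtype, type(joined[0, 1]))
```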

But if I assign the elements one by one, the np.datetime64 scalars are stored as they are:

In [62]: for i in range(5):
    ...:     d[i,1]=a[i,0]
    ...:     
In [63]: d
Out[63]: 
array([[0, numpy.datetime64('2018-04-01T15:30:00')],
       [1, numpy.datetime64('2018-04-01T15:31:00')],
       [2, numpy.datetime64('2018-04-01T15:32:00')],
       [3, numpy.datetime64('2018-04-01T15:33:00')],
       [4, numpy.datetime64('2018-04-01T15:34:00')]], dtype=object)

But what is the value of such an array?

I can add a timedelta to the original time array:

In [67]: a + np.array(10, 'timedelta64[m]')
Out[67]: 
array([['2018-04-01T15:40:00'],
       ['2018-04-01T15:41:00'],
       ['2018-04-01T15:42:00'],
       ['2018-04-01T15:43:00'],
       ['2018-04-01T15:44:00']], dtype='datetime64[s]')

But I can't do the same with the object array column:

In [68]: d[:,1] + np.array(10, 'timedelta64[m]')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-68-f82827d3d355> in <module>()
----> 1 d[:,1] + np.array(10, 'timedelta64[m]')

TypeError: ufunc add cannot use operands with types dtype('O') and dtype('<m8[m]')

I have to iterate over the objects explicitly:

In [70]: for i in range(5):
    ...:     d[i,1] += np.array(i*10, 'timedelta64[m]')
    ...:     
In [71]: d
Out[71]: 
array([[0, numpy.datetime64('2018-04-01T15:30:00')],
       [1, numpy.datetime64('2018-04-01T15:41:00')],
       [2, numpy.datetime64('2018-04-01T15:52:00')],
       [3, numpy.datetime64('2018-04-01T16:03:00')],
       [4, numpy.datetime64('2018-04-01T16:14:00')]], dtype=object)
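One possible alternative to the explicit loop, which is not part of the original answer, is to cast the object column back to datetime64 and do the arithmetic on the resulting typed array. This is only a sketch, and it assumes every entry in the column is a np.datetime64 (or datetime.datetime) value; the names `dates` and `shifted` are illustrative.

```python
import numpy as np

a = np.array(['2018-04-01T15:30:00', '2018-04-01T15:31:00',
              '2018-04-01T15:32:00'], dtype='datetime64[s]')

# Fill the object array element by element so that the np.datetime64
# scalars are stored unchanged.
d = np.empty((3, 2), dtype=object)
for i in range(3):
    d[i, 0] = i
    d[i, 1] = a[i]

# Cast the object column back to a typed array, then use vectorized
# arithmetic on it. The object array `d` itself is left untouched.
dates = d[:, 1].astype('datetime64[s]')
shifted = dates + np.timedelta64(10, 'm')
print(shifted)
```

The cast makes a copy, so the result has to be written back into `d` element by element if the object array itself needs to be updated.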

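Answer 1 (score: 1)

Use a record array instead of dtype=object

You can solve this by constructing an array that properly handles columns of different types. The simplest way is to make a record array, like so: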

rarr = np.rec.fromarrays([a, c], names=('date', 'val'))

print(rarr)
# output
#     rec.array([[('2018-04-01T15:30:00', 0)],
#                [('2018-04-01T15:31:00', 1)],
#                [('2018-04-01T15:32:00', 2)],
#                [('2018-04-01T15:33:00', 3)],
#                [('2018-04-01T15:34:00', 4)]],
#               dtype=[('date', '<M8[s]'), ('val', '<i8')])

print(rarr.date)
# output
#     array([['2018-04-01T15:30:00'],
#            ['2018-04-01T15:31:00'],
#            ['2018-04-01T15:32:00'],
#            ['2018-04-01T15:33:00'],
#            ['2018-04-01T15:34:00']], dtype='datetime64[s]')

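As hpaulj points out, no matter what you do, you can't add to (or otherwise easily operate on) a datetime64 column stored in a dtype=object array. With a record array, however, this is easy: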

print(rarr.date + np.array(10, 'timedelta64[m]'))
# output
#     array([['2018-04-01T15:40:00'],
#            ['2018-04-01T15:41:00'],
#            ['2018-04-01T15:42:00'],
#            ['2018-04-01T15:43:00'],
#            ['2018-04-01T15:44:00']], dtype='datetime64[s]')
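Because the `date` field keeps its datetime64 dtype, it can also be updated in place. Here is a brief sketch, using flattened 1-D inputs for readability (an assumption made for this example; the answer above passes the column-shaped `a` and `c`):

```python
import numpy as np

dates = np.array(['2018-04-01T15:30:00', '2018-04-01T15:31:00'],
                 dtype='datetime64[s]')
vals = np.array([0, 1])

# Each input array becomes one typed field of the record array.
rarr = np.rec.fromarrays([dates, vals], names='date,val')

# The field is a view onto the record array's memory, so in-place
# arithmetic updates the stored dates.
rarr['date'] += np.timedelta64(10, 'm')

print(rarr.date.dtype)
print(rarr[0])
```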