How to convert a 1D array of datetime strings into a 2D array of characters

Time: 2018-03-13 07:12:39

Tags: python numpy datetime

Suppose I have an array of datetime strings:

       new_time[index]
Out[9]: 
array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'], dtype='<U19')

Its shape is (24,). The question is how to turn it into a (24, 19) array, where a row of the new array might look like this:

 ## one row of new array 
Out[10]: 
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
    b'0', b'0', b':', b'0', b'0', b':', b'0', b'0']], dtype='|S1')

Thanks for your help.

4 answers:

Answer 0: (score: 2)

For your array:

import numpy as np

a = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'], dtype='<U19')

You need to view it as U1, convert to S1, and reshape:

>>> a.view('U1').astype('S1').reshape(a.size, -1)
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'0', b':', b'0', b'0', b':', b'0', b'0'],
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'1', b':', b'0', b'0', b':', b'0', b'0'],
       ...
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'2', b'3', b':', b'0', b'0', b':', b'0', b'0']], 
      dtype='|S1')

Viewing it directly as S1 does not work, because each unicode character takes 4 bytes:

>>> a.view('S1').shape
(1824,)
>>> a.view('U1').shape
(456,)
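The byte counts behind those two shapes can be read off the dtype itself. The following is a minimal sketch that uses a shortened three-element stand-in for the array above; the names `a_small`, `bytes_per_char` and `chars_per_item` are only for illustration.

```python
import numpy as np

# Shortened stand-in for the 24-element array in the question.
a_small = np.array(['2012-09-01_00:00:00',
                    '2012-09-01_01:00:00',
                    '2012-09-01_02:00:00'], dtype='<U19')

# NumPy stores each unicode character as a 4-byte UCS-4 code point.
bytes_per_char = np.dtype('U1').itemsize
chars_per_item = a_small.dtype.itemsize // bytes_per_char

print(bytes_per_char)              # 4
print(chars_per_item)              # 19
print(a_small.view('U1').shape)    # (57,)  = 3 * 19 characters
print(a_small.view('S1').shape)    # (228,) = 3 * 19 * 4 bytes
```

With the full 24-element array the same arithmetic gives 24 * 19 = 456 characters and 456 * 4 = 1824 bytes, matching the shapes shown above.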

If you start from an S19 array (called b here), you can view it as S1 right away:

>>> b.dtype
dtype('S19')
>>> b.view('S1').reshape(b.size, -1)
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'0', b':', b'0', b'0', b':', b'0', b'0'],
       ...
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'2', b'3', b':', b'0', b'0', b':', b'0', b'0']], 
      dtype='|S1')
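The answer does not show how `b` was created. A minimal sketch, assuming it comes from casting the unicode array with `astype('S19')` (that step, and the names `a_small`, `b_small` and `chars`, are assumptions for illustration):

```python
import numpy as np

a_small = np.array(['2012-09-01_00:00:00',
                    '2012-09-01_01:00:00'], dtype='<U19')

# Assumption: the byte-string array is obtained by casting the unicode one.
# This only works when every character is ASCII.
b_small = a_small.astype('S19')

# One byte per character now, so a plain view splits each string into S1 items.
chars = b_small.view('S1').reshape(b_small.size, -1)

print(chars.shape)        # (2, 19)
print(chars[1, 11:13])    # [b'0' b'1']  (the hour field of the second string)
```

Because the S19 items are already one byte per character, no copy is needed beyond the initial cast.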

Answer 1: (score: 1)

If you are OK with a non-contiguous view, you can use either of the following:

X.view('S1').reshape(X.size, -1, 4)[..., 0]

X.view('S1').reshape(X.size, -1)[:, ::4]

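Since it shares data with the original array, this is very cheap, but be aware that modifying the result in place also changes the original array. Of course, you can always make a copy.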
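The sharing can be observed directly. This is a small sketch rather than a definitive recipe: it assumes little-endian storage (forced here with `'<U19'`) and ASCII-only strings, since it keeps only the low byte of each 4-byte character. The names `v` and `c` are illustrative.

```python
import numpy as np

X = np.array(['2012-09-01_00:00:00',
              '2012-09-01_01:00:00'], dtype='<U19')

# Keep every 4th byte: with little-endian UCS-4 storage this is the low byte
# of each code point, which equals the ASCII character for ASCII-only input.
v = X.view('S1').reshape(X.size, -1)[:, ::4]

print(v.shape)                    # (2, 19)
print(v.flags['C_CONTIGUOUS'])    # False
print(np.shares_memory(v, X))     # True

# Writing through the view changes the original strings as well.
v[0, 10] = b' '
print(X[0])                       # 2012-09-01 00:00:00

# An independent, contiguous copy avoids that coupling.
c = v.copy()
print(c.flags['C_CONTIGUOUS'])    # True
```

For non-ASCII text or big-endian storage the stride trick would pick the wrong bytes, so the `view('U1').astype('S1')` route is the safer choice there.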

Answer 2: (score: 0)

You can use a list comprehension to split the strings into characters, and then use np.asarray() to get a 2D array:
import numpy as np

x = np.asarray(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'])

temp = []
for i in x:
    temp.append([j for j in i])
np.asarray(temp, dtype = 'S1')

Or, more concisely:

temp = [[j for j in i] for i in x]   
temp = np.asarray(temp, dtype = 'S1')
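As a quick sanity check, the list-comprehension result can be compared with the view-based route from Answer 0. The sketch below uses a two-element stand-in for `x`; `via_list` and `via_view` are illustrative names, and `list(s)` is used as an equivalent of `[j for j in s]`.

```python
import numpy as np

x = np.asarray(['2012-09-01_00:00:00', '2012-09-01_01:00:00'])

# list(s) splits a string into its characters, same as [j for j in s].
via_list = np.asarray([list(s) for s in x], dtype='S1')

# View-based route: unicode characters -> bytes -> one row per string.
via_view = x.view('U1').astype('S1').reshape(x.size, -1)

print(via_list.shape)                        # (2, 19)
print(np.array_equal(via_list, via_view))    # True
```

Both routes give the same array; the list comprehension loops in Python and so is expected to be slower on large inputs, while the view-based route stays inside NumPy.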

Answer 3: (score: 0)

Iterating over each value and collecting its characters into a list solves this.

import numpy as np
array_24 = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'])
array_24.shape        #(24,)
array_24_19 = np.asarray([[j for j in i] for i in array_24])
array_24_19.shape     #(24, 19)
array_24_19[0]
# array(['2', '0', '1', '2', '-', '0', '9', '-', '0', '1', '_', '0', '0',
#        ':', '0', '0', ':', '0', '0'], dtype='|S1')
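The dtype shown in that last comment is what Python 2 produces, where plain string literals are byte strings. On Python 3 the same call yields a unicode character array unless a dtype is passed. A minimal sketch with a two-element stand-in (`array_small`, `as_unicode` and `as_bytes` are illustrative names):

```python
import numpy as np

array_small = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00'])

# Without a dtype, Python 3 strings become 4-byte unicode characters.
as_unicode = np.asarray([list(s) for s in array_small])

# Passing dtype='S1' gives the single-byte form shown in the question.
as_bytes = np.asarray([list(s) for s in array_small], dtype='S1')

print(as_unicode.dtype.kind, as_unicode.dtype.itemsize)   # U 4
print(as_bytes.dtype.kind, as_bytes.dtype.itemsize)       # S 1
print(as_bytes.shape)                                     # (2, 19)
```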

I hope this helps.