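Convert a "slightly inconsistent" Pandas Column to DateTime

Asked: 2016-08-09 02:11:59

Tags: python datetime pandas

I have a column of data that comes from several different sources, so the timestamp strings are slightly inconsistent: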

data_test       DataTime
0    2012-10-03 12:14:18.257000000
1    2012-10-01 08:39:54.633000000
2    2012-10-05 07:50:14.203000000
3    2012-10-02 15:02:42.843000000
4              2012-10-02 09:02:13
5              2012-10-02 09:02:13
6              2012-10-09 11:00:36
7              2012-10-09 11:00:36

Some of the seconds values are whole numbers and some have a fractional part, so both of the following approaches fail:

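import datetime as dt
# Method 1: expect fractional seconds. Raises ValueError on rows without a
# fraction; %f also accepts at most 6 digits, so the 9-digit rows fail as well.
data_test['DataTime'] = data_test['DataTime'].apply(lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f'))
# Method 2: expect whole seconds. Raises ValueError on rows that have a fraction.
data_test['DataTime'] = data_test['DataTime'].apply(lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))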

Is there a way to convert this column to DateTime?

1 Answer:

Answer 0 (score: 1)

You can use the pd.to_datetime() function:

In [222]: df
Out[222]:
                        DataTime
0  2012-10-03 12:14:18.257000000
1  2012-10-01 08:39:54.633000000
2  2012-10-05 07:50:14.203000000
3  2012-10-02 15:02:42.843000000
4            2012-10-02 09:02:13
5            2012-10-02 09:02:13
6            2012-10-09 11:00:36
7            2012-10-09 11:00:36

In [223]: df.dtypes
Out[223]:
DataTime    object
dtype: object

In [224]: df.DataTime = pd.to_datetime(df.DataTime)

In [225]: df
Out[225]:
                 DataTime
0 2012-10-03 12:14:18.257
1 2012-10-01 08:39:54.633
2 2012-10-05 07:50:14.203
3 2012-10-02 15:02:42.843
4 2012-10-02 09:02:13.000
5 2012-10-02 09:02:13.000
6 2012-10-09 11:00:36.000
7 2012-10-09 11:00:36.000

In [226]: df.dtypes
Out[226]:
DataTime    datetime64[ns]
dtype: object
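
A note for newer pandas versions: the session above dates from 2016. In pandas 2.0 and later, to_datetime infers a single format from the first value, so a plain call on a column that mixes whole and fractional seconds may raise a ValueError. The sketch below is one way to handle both cases; it assumes every string is an ISO 8601 variant (as in the sample data), and the version check and the shortened sample frame are illustrative choices, not part of the original answer.

```python
import pandas as pd

# A shortened copy of the sample column: fractional and whole seconds mixed.
df = pd.DataFrame({"DataTime": [
    "2012-10-03 12:14:18.257000000",
    "2012-10-02 09:02:13",
]})

if int(pd.__version__.split(".")[0]) >= 2:
    # pandas >= 2.0: format="ISO8601" accepts any ISO 8601 variant per element.
    parsed = pd.to_datetime(df["DataTime"], format="ISO8601")
else:
    # Older pandas: the default call already parses each element independently.
    parsed = pd.to_datetime(df["DataTime"])

df["DataTime"] = parsed
print(df.dtypes)
```

If some strings may not be valid timestamps at all, passing errors="coerce" turns them into NaT instead of raising.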