I have a column of data collected from different sources, so the timestamp strings are slightly inconsistent:
data_test DataTime
0 2012-10-03 12:14:18.257000000
1 2012-10-01 08:39:54.633000000
2 2012-10-05 07:50:14.203000000
3 2012-10-02 15:02:42.843000000
4 2012-10-02 09:02:13
5 2012-10-02 09:02:13
6 2012-10-09 11:00:36
7 2012-10-09 11:00:36
Some of the seconds values are integers and some have a fractional part, so both of the following approaches fail:
import datetime as dt
# Method 1: expect fractional seconds (fails on rows without a fraction)
data_test['DataTime'] = data_test['DataTime'].apply(lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f'))
# Method 2: expect whole seconds (fails on rows with a fraction)
data_test['DataTime'] = data_test['DataTime'].apply(lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))
Is there a way to convert this column to datetime?
Answer 0: (score: 1)
You can use the pd.to_datetime() function:
In [222]: df
Out[222]:
DataTime
0 2012-10-03 12:14:18.257000000
1 2012-10-01 08:39:54.633000000
2 2012-10-05 07:50:14.203000000
3 2012-10-02 15:02:42.843000000
4 2012-10-02 09:02:13
5 2012-10-02 09:02:13
6 2012-10-09 11:00:36
7 2012-10-09 11:00:36
In [223]: df.dtypes
Out[223]:
DataTime object
dtype: object
In [224]: df.DataTime = pd.to_datetime(df.DataTime)
In [225]: df
Out[225]:
DataTime
0 2012-10-03 12:14:18.257
1 2012-10-01 08:39:54.633
2 2012-10-05 07:50:14.203
3 2012-10-02 15:02:42.843
4 2012-10-02 09:02:13.000
5 2012-10-02 09:02:13.000
6 2012-10-09 11:00:36.000
7 2012-10-09 11:00:36.000
In [226]: df.dtypes
Out[226]:
DataTime datetime64[ns]
dtype: object
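If pd.to_datetime() is not an option, a plain-Python fallback can handle both string shapes. The sketch below is only an illustration, not part of the original answer: the helper name parse_timestamp is hypothetical, and it assumes every string looks like 'YYYY-MM-DD HH:MM:SS' with an optional '.fraction' suffix. Note that strptime's %f directive accepts at most six digits, so the nine-digit fractions in this data are truncated to microseconds by hand.

```python
import datetime as dt


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' with an optional fractional-seconds suffix.

    Hypothetical helper for illustration; assumes no timezone information.
    """
    # Split off the fractional part, if any ('' when there is no dot).
    head, _, frac = text.partition(".")
    parsed = dt.datetime.strptime(head, "%Y-%m-%d %H:%M:%S")
    if frac:
        # Keep at most 6 digits (microseconds) and right-pad short fractions,
        # so '257000000' -> 257000 and '5' -> 500000.
        parsed = parsed.replace(microsecond=int(frac[:6].ljust(6, "0")))
    return parsed


# Two sample strings taken from the question's data.
samples = ["2012-10-03 12:14:18.257000000", "2012-10-02 09:02:13"]
parsed = [parse_timestamp(s) for s in samples]
```

With a DataFrame, the same helper could be applied as data_test['DataTime'].apply(parse_timestamp). Separately, on pandas 2.0 or later the format is inferred from the first value and applied to the whole column, so mixed strings like these may need pd.to_datetime(df.DataTime, format="ISO8601") instead of the bare call.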