I currently have a DataFrame in which one column holds datetime values stored as object dtype.
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 2016-06-03 02:00:00
What I want to do is convert col3 to datetime values so that it gives me:
Year-Month-Day-Hour
and then do some datetime feature engineering later on. When I try:
df['col3'] = pd.to_datetime(df['col3'])
I get this error:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 3008-07-25 00:00:00
Any ideas?
Thanks
Answer 0 (score: 3)
You can use the parameter errors='coerce' to convert out-of-bounds values to NaT:
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 3008-07-25 00:00:00
df['col3'] = pd.to_datetime(df['col3'], errors='coerce')
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 NaT
The value is out of bounds because a nanosecond-resolution Timestamp can only represent dates in this range:
In [68]: pd.Timestamp.min
Out[68]: Timestamp('1677-09-21 00:12:43.145225')
In [69]: pd.Timestamp.max
Out[69]: Timestamp('2262-04-11 23:47:16.854775807')
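Before coercing, it can help to see which rows would fall outside that range. The following is only a minimal sketch, not part of the original answer: it assumes every value in col3 is a string starting with a four-digit year, and the names SAFE_MIN_YEAR, SAFE_MAX_YEAR, and out_of_range are made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer; col3 is still plain strings.
df = pd.DataFrame({
    "col1": ["A", "B", "C", "D"],
    "col2": [10, 11, 12, 13],
    "col3": ["2016-06-05 11:00:00", "2016-06-04 00:00:00",
             "2016-06-02 05:00:00", "3008-07-25 00:00:00"],
})

# Conservative bounds (an assumption of this sketch): every year from 1678
# through 2261 lies entirely inside the nanosecond Timestamp range shown above.
SAFE_MIN_YEAR, SAFE_MAX_YEAR = 1678, 2261

# Read the year from the first four characters of each string.
years = df["col3"].str[:4].astype(int)
out_of_range = (years < SAFE_MIN_YEAR) | (years > SAFE_MAX_YEAR)

# Rows whose date would not fit a nanosecond timestamp.
print(df[out_of_range])
```

Checking the year as an integer avoids parsing the dates at all, so the check does not itself raise OutOfBoundsDatetime. Dates in the partial boundary years 1677 and 2262 are flagged as well, which is deliberately cautious.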
You can also create Periods, but that is not easy to do from strings:
def conv(x):
    # Build an hourly Period by slicing a 'YYYY-MM-DD HH:MM:SS' string.
    return pd.Period(year=int(x[:4]),
                     month=int(x[5:7]),
                     day=int(x[8:10]),
                     hour=int(x[11:13]), freq='H')
df['col3'] = df['col3'].apply(conv)
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00
0 B 11 2016-06-04 00:00
0 C 12 2016-06-02 05:00
0 D 13 3008-07-25 00:00
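Since the goal in the question is Year-Month-Day-Hour features, here is a rough sketch of how the components might be read back off the Period values. It is an illustration rather than part of the original answer: the new column names are made up, and conv is rewritten to pass pd.offsets.Hour() instead of the 'H' string, on the assumption that the string alias may be deprecated in newer pandas releases.

```python
import pandas as pd

def conv(x):
    # Same slicing as the answer's conv; freq is given as an offset object.
    return pd.Period(year=int(x[:4]), month=int(x[5:7]), day=int(x[8:10]),
                     hour=int(x[11:13]), freq=pd.offsets.Hour())

df = pd.DataFrame({"col3": ["2016-06-05 11:00:00", "3008-07-25 00:00:00"]})
periods = df["col3"].apply(conv)

# Pull each component off the Period objects, one new column per component.
for part in ["year", "month", "day", "hour"]:
    df[part] = periods.apply(lambda p, part=part: getattr(p, part))

print(df)
```

Reading the attributes from each Period keeps the year 3008 row intact, whereas errors='coerce' would replace it with NaT.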