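I have a DataFrame column containing dates in two formats, and I want to convert it to datetime values in a single format.
The column values look like this: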
0 2011-11-23 16:13:50
1 2016-02-06
2 2011-11-27
3 2014-04-17 22:41:08
4 2013-12-11 17:08:20
5 2011-08-13
6 2007-07-25
7 2009-03-17 15:55:59
8 2017-08-25
and so on.
I tried to do this with the following command:
df['Date'] = df['Date'].apply(lambda x: pd.to_datetime(x[0]))
Error:
Traceback (most recent call last):
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/pandas/core/tools/datetimes.py", line 377, in _convert_listlike
values, tz = conversion.datetime_to_datetime64(arg)
File "pandas/_libs/tslibs/conversion.pyx", line 188, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2961, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-16-e0bd36ee24b7>", line 1, in <module>
df['Date'] = df['Date'].apply(lambda x: pd.to_datetime(x[0]))
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/pandas/core/series.py", line 3194, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/src/inference.pyx", line 1472, in pandas._libs.lib.map_infer
File "<ipython-input-16-e0bd36ee24b7>", line 1, in <lambda>
df['Date'] = df['Date'].apply(lambda x: pd.to_datetime(x[0]))
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/pandas/core/tools/datetimes.py", line 469, in to_datetime
result = _convert_listlike(np.array([arg]), box, format)[0]
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/pandas/core/tools/datetimes.py", line 380, in _convert_listlike
raise e
File "/Users/stevengerrits/miniconda3/envs/py35thesis/lib/python3.5/site-packages/pandas/core/tools/datetimes.py", line 368, in _convert_listlike
require_iso8601=require_iso8601
File "pandas/_libs/tslib.pyx", line 492, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslib.pyx", line 739, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslib.pyx", line 733, in pandas._libs.tslib.array_to_datetime
Answer 0 (score: 1)
First, try to_datetime with errors='coerce', which converts unparseable values to NaT:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
print(df)
Date
0 2011-11-23 16:13:50
1 2016-02-06 00:00:00
2 2011-11-27 00:00:00
3 2014-04-17 22:41:08
4 2013-12-11 17:08:20
5 2011-08-13 00:00:00
6 2007-07-25 00:00:00
7 2009-03-17 15:55:59
8 2017-08-25 00:00:00
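A self-contained sketch of this first approach may help when checking whether it works on a given pandas version. The sample values below are a hypothetical subset of the question's data, and the names first_char, parsed and n_failed are illustrative only. As far as I know, recent pandas releases (2.0 and later) infer one format from the first value, so rows in the other format may come back as NaT; counting the NaT values before overwriting the column makes that visible.

```python
import pandas as pd

# Hypothetical sample data mirroring the two formats in the question.
df = pd.DataFrame({'Date': ['2011-11-23 16:13:50', '2016-02-06',
                            '2011-11-27', '2014-04-17 22:41:08']})

# In the question's lambda, x is a single string, so x[0] is only its
# first character, not a full date string.
first_char = df['Date'].iloc[0][0]

# Vectorised conversion of the whole column; unparseable values become NaT.
parsed = pd.to_datetime(df['Date'], errors='coerce')

# Count rows that could not be parsed before overwriting the column.
n_failed = int(parsed.isna().sum())
print(first_char, n_failed)
```

If n_failed is greater than zero, the single call was not enough for this data and pandas version.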
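If that does not work, call to_datetime once per format, each with errors='coerce', then chain the results with Series.combine_first, which fills the missing values of one Series with the values from the other: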
date1 = pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
date2 = pd.to_datetime(df['Date'], format='%Y-%m-%d', errors='coerce')
df['Date'] = date1.combine_first(date2)
print(df)
Date
0 2011-11-23 16:13:50
1 2016-02-06 00:00:00
2 2011-11-27 00:00:00
3 2014-04-17 22:41:08
4 2013-12-11 17:08:20
5 2011-08-13 00:00:00
6 2007-07-25 00:00:00
7 2009-03-17 15:55:59
8 2017-08-25 00:00:00
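The two-format approach above could be wrapped in a small helper, sketched here under some assumptions: the function name parse_two_formats and the sample Series raw are hypothetical, and only the two formats from the question are handled. A value matching neither format stays NaT, so it can be inspected afterwards.

```python
import pandas as pd


def parse_two_formats(series):
    """Parse strings that are either 'YYYY-MM-DD HH:MM:SS' or 'YYYY-MM-DD'."""
    # First pass: values carrying a time component.
    with_time = pd.to_datetime(series, format='%Y-%m-%d %H:%M:%S',
                               errors='coerce')
    # Second pass: date-only values.
    date_only = pd.to_datetime(series, format='%Y-%m-%d', errors='coerce')
    # Fill the gaps (NaT) left by the first pass with the second pass.
    return with_time.combine_first(date_only)


# Hypothetical input: one value per format, plus one that matches neither.
raw = pd.Series(['2011-11-23 16:13:50', '2016-02-06', 'not a date'])
result = parse_two_formats(raw)
print(result)
```

On pandas 2.0 or later, to_datetime also accepts format='ISO8601' and format='mixed', which may handle this case in a single call; the sketch avoids them so that it also runs on older versions.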