Why can't I apply pandas.DatetimeIndex to multiple columns?

Asked: 2016-08-18 15:36:09

Tags: python pandas datetimeindex

I am trying to strip the time part from several pandas columns with the following code:

group_df['submitted_on'] = pd.DatetimeIndex(group_df['submitted_on']).to_period('d')
group_df['resolved_on'] = pd.DatetimeIndex(group_df['resolved_on']).to_period('d')

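This works for the first column, but I can't figure out why I can't apply it to more than one column.

When the second line runs, I get the following error: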

  File "C:/Users/anshanno/PycharmProjects/RETIvizScript/RetiViz.py", line 271, in join_groups
    group_df['resolved_on'] = pd.DatetimeIndex(group_df['resolved_on']).to_period('d')
  File "C:\Python27\lib\site-packages\pandas\util\decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\tseries\index.py", line 349, in __new__
    values, freq=freq, dayfirst=dayfirst, yearfirst=yearfirst)
  File "pandas\tslib.pyx", line 2347, in pandas.tslib.parse_str_array_to_datetime (pandas\tslib.c:42450)
ValueError

Since the bare ValueError tells me nothing, I tried errors='coerce' without any luck: I still get the same nondescript error.

group_df['resolved_on'] = pd.DatetimeIndex(group_df['resolved_on'], errors='coerce').to_period('d')

Edit (sample data):

"identifier","status","submitted_on","resolved_on","closed_on","duplicate_on","junked_on","unproducible_on","verified_on"
"xx1","D","2004-07-28 07:00:00.0","null","null","2004-08-26 07:00:00.0","null","null","null"
"xx2","N","2010-03-02 03:00:16.0","null","null","null","null","null","null"
"xx3","U","2005-10-26 14:20:20.0","null","null","null","null","2005-11-01 13:02:22.0","null"
"xx4","V","2006-06-30 07:00:00.0","2006-09-15 07:00:00.0","null","null","null","null","2006-11-20 08:00:00.0"
"xx5","R","2012-09-21 06:30:58.0","2013-06-06 09:35:25.0","null","null","null","null","null"
"xx6","D","2009-11-25 02:16:03.0","null","null","2010-02-26 12:28:22.0","null","null","null"
"xx7","D","2003-08-29 07:00:00.0","null","null","2003-08-29 07:00:00.0","null","null","null"
"xx8","R","2003-06-06 12:00:00.0","2003-06-24 12:00:00.0","null","null","null","null","null"
"xx9","R","2004-11-05 08:00:00.0","2004-11-15 08:00:00.0","null","null","null","null","null"
"xx10","R","2008-02-21 05:13:39.0","2008-09-25 17:20:57.0","null","null","null","null","null"
"xx11","R","2007-03-08 17:47:44.0","2007-03-21 23:47:57.0","null","null","null","null","null"
"xx12","R","2011-08-22 19:50:25.0","2012-06-21 05:52:12.0","null","null","null","null","null"
"xx13","J","2003-07-07 12:00:00.0","null","null","null","2003-07-10 12:00:00.0","null","null"
"xx14","A","2008-09-24 11:36:34.0","null","null","null","null","null","null"

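Thanks, any help is appreciated.

1 Answer:

Answer 0: (score: 2)

Use pd.to_datetime instead of pd.DatetimeIndex, and pass errors='coerce' so that values that cannot be parsed become NaT: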

group_df['submitted_on'] = pd.to_datetime(group_df['submitted_on'], errors='coerce').dt.to_period('D')
group_df['resolved_on'] = pd.to_datetime(group_df['resolved_on'], errors='coerce').dt.to_period('D')

group_df
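
To see the effect on rows like the ones in the question, here is a minimal sketch. It assumes the missing values reach pandas as the literal string "null", which the sample data suggests but the question does not confirm. The two-row frame and the explicit format string are illustrative choices of mine, not part of the original answer.

```python
import pandas as pd

# Two rows modelled on the question's sample data (xx1 and xx4).
# Assumption: missing values are stored as the literal string "null".
group_df = pd.DataFrame({
    "identifier": ["xx1", "xx4"],
    "submitted_on": ["2004-07-28 07:00:00.0", "2006-06-30 07:00:00.0"],
    "resolved_on": ["null", "2006-09-15 07:00:00.0"],
})

# Assumed timestamp layout, taken from the sample rows.
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

for col in ["submitted_on", "resolved_on"]:
    # errors="coerce" turns values that do not match (such as "null") into NaT
    parsed = pd.to_datetime(group_df[col], format=TS_FORMAT, errors="coerce")
    # .dt.to_period("D") keeps only the calendar day
    group_df[col] = parsed.dt.to_period("D")

print(group_df)
```

The "null" entry in resolved_on ends up as NaT instead of raising, so the same loop can be run over every date column.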

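
If the frame is loaded from a CSV file like the sample in the question (an assumption, since the question does not show how group_df is built), another option is to mark "null" as missing while reading, then convert the date columns. The inline csv_text and the TS_FORMAT constant below are hypothetical stand-ins for the real file and its timestamp layout.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: header plus two sample rows.
csv_text = '''"identifier","status","submitted_on","resolved_on"
"xx1","D","2004-07-28 07:00:00.0","null"
"xx4","V","2006-06-30 07:00:00.0","2006-09-15 07:00:00.0"
'''

# na_values makes the handling of "null" explicit: it is read as NaN.
df = pd.read_csv(io.StringIO(csv_text), na_values=["null"])

# Assumed timestamp layout, taken from the sample rows.
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

for col in ["submitted_on", "resolved_on"]:
    # NaN becomes NaT; errors="coerce" also covers any other stray values
    parsed = pd.to_datetime(df[col], format=TS_FORMAT, errors="coerce")
    df[col] = parsed.dt.to_period("D")

print(df[["submitted_on", "resolved_on"]])
```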