Unable to convert to datetime using pd.to_datetime

Time: 2014-10-28 16:07:23

Tags: python datetime csv pandas

I am trying to read a CSV file and turn it into a DataFrame to use as a time series. The CSV file looks like this:

         #Date      Time    CO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT
0          NaN       NaN                                     %   
1          NaN       NaN  Cooling Coil Hydronic Valve Position   
2   2014-01-01  00:00:00                                     0   
3   2014-01-01  01:00:00                                     0   
4   2014-01-01  02:00:00                                     0   
5   2014-01-01  03:00:00                                     0   
6   2014-01-01  04:00:00                                     0

I read the file with:

df = pd.read_csv('filepath/file.csv', sep=';', parse_dates=[[0, 1]])

which produces this result:

             #Date_Time   FCO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT
0               nan nan                                     %   
1               nan nan  Cooling Coil Hydronic Valve Position   
2   2014-01-01 00:00:00                                     0   
3   2014-01-01 01:00:00                                     0   
4   2014-01-01 02:00:00                                     0   
5   2014-01-01 03:00:00                                     0   
6   2014-01-01 04:00:00                                     0

I then go on to convert the strings to datetime and use them as the index:

pd.to_datetime(df.values[:,0])
df.set_index([df.columns[0]], inplace=True)

So I get:

                      FCO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT
#Date_Time                                                  
nan nan                                                 %   
nan nan              Cooling Coil Hydronic Valve Position   
2014-01-01 00:00:00                                     0   
2014-01-01 01:00:00                                     0   
2014-01-01 02:00:00                                     0   
2014-01-01 03:00:00                                     0   
2014-01-01 04:00:00                                     0 

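However, pd.to_datetime fails to convert the values to datetime. Is there a way to find out what the error is?

Thanks a lot. Luis

1 Answer:

Answer 0: (score: 1)

The string entries 'nan nan' cannot be converted by to_datetime, so replace them with empty strings; those can then be converted to NaT: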

In [122]:

df['Date_Time'].replace('nan nan', '',inplace=True)
df
Out[122]:
             Date_Time  index       CO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT
0                           0                                     %   
1                           1  Cooling Coil Hydronic Valve Position   
2  2014-01-01 00:00:00      2                                     0   
3  2014-01-01 01:00:00      3                                     0   
4  2014-01-01 02:00:00      4                                     0   
5  2014-01-01 03:00:00      5                                     0   
6  2014-01-01 04:00:00      6                                     0
In [124]:

df['Date_Time'] = pd.to_datetime(df['Date_Time'])
df

Out[124]:
            Date_Time  index       CO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT
0                 NaT      0                                     %   
1                 NaT      1  Cooling Coil Hydronic Valve Position   
2 2014-01-01 00:00:00      2                                     0   
3 2014-01-01 01:00:00      3                                     0   
4 2014-01-01 02:00:00      4                                     0   
5 2014-01-01 03:00:00      5                                     0   
6 2014-01-01 04:00:00      6                                     0
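
For reference, here is a minimal self-contained sketch of the replace-then-convert approach, carried through to the DatetimeIndex the question is after. The small frame is a hypothetical stand-in for the real CSV, and dropping the two NaT rows (the unit and description rows) is an assumption about what is wanted. The result of replace is assigned back rather than using inplace=True on a column selection, since the in-place form may not update the frame in newer pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for the frame produced by read_csv in the question.
df = pd.DataFrame({
    "Date_Time": ["nan nan", "nan nan",
                  "2014-01-01 00:00:00", "2014-01-01 01:00:00"],
    "CO_T1_AHU.01_CC_CTRV_CHW__SIG_STAT": [
        "%", "Cooling Coil Hydronic Valve Position", "0", "0"],
})

# Replace the unparseable placeholder with an empty string and assign back.
df["Date_Time"] = df["Date_Time"].replace("nan nan", "")

# Empty strings are converted to NaT.
df["Date_Time"] = pd.to_datetime(df["Date_Time"])

# Assumption: the unit/description rows are not needed in the time series,
# so drop the rows whose timestamp is NaT and index by the timestamp.
ts = df.dropna(subset=["Date_Time"]).set_index("Date_Time")

print(type(ts.index).__name__)
print(len(ts))
```

The remaining value column still holds strings here; pd.to_numeric could be applied to it if numeric values are needed.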

Update

Actually, if you just set coerce=True, the conversion works fine:

df['Date_Time'] = pd.to_datetime(df['Date_Time'], coerce=True)
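
Note that in later pandas releases the coerce keyword was replaced by errors='coerce'. The sketch below uses that spelling and also shows one way to see which entries failed to parse, which is roughly what the question asks. The sample Series and the explicit format string are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical stand-in for the combined date/time column.
s = pd.Series(["nan nan", "nan nan",
               "2014-01-01 00:00:00", "2014-01-01 01:00:00"])

# errors="coerce" turns unparseable entries into NaT instead of raising.
# The explicit format is an assumption about how the timestamps are written.
parsed = pd.to_datetime(s, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Inspect the original strings that could not be converted.
unparsed = s[parsed.isna()]
print(unparsed.unique())
```

Leaving errors at its default ('raise') is another way to find the problem: the exception message normally names the first value that could not be parsed.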