Converting a DataFrame's date columns to a datetime type

Time: 2020-02-01 19:30:00

Tags: python pandas dataframe datetime

I have the following DataFrame:

      import pandas as pd
      from datetime import datetime

      df = pd.DataFrame({'Id_sensor': [1, 2, 3, 4], 
                         'Date_start': ['2018-01-04 00:00:00.0', '2018-01-04 00:00:10.0',
                                        '2018-01-04 00:14:00.0', '2018-01-04'],
                         'Date_end': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})

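The columns (Date_start and Date_end) have dtype object. I want to convert them to a datetime dtype and make both columns look the same. That is, the column that has no time part (Date_end) should have its hour, minute, and second fields filled with zeros.

I tried the following code: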

      df['Date_start'] = pd.to_datetime(df['Date_start'], format='%Y/%m/%d %H:%M:%S')
      df['Date_end'] = pd.to_datetime(df['Date_end'], format='%Y/%m/%d %H:%M:%S')

My output:

        Id_sensor     Date_start         Date_end
           1       2018-01-04 00:00:00  2018-01-05
           2       2018-01-04 00:00:10  2018-01-06
           3       2018-01-04 00:14:00  2017-01-06
           4       2018-01-04 00:00:00  2018-01-05

But I would like the output to look like this:

           Id_sensor      Date_start         Date_end
           1       2018-01-04 00:00:00    2018-01-05 00:00:00
           2       2018-01-04 00:00:10    2018-01-06 00:00:00
           3       2018-01-04 00:14:00    2017-01-06 00:00:00
           4       2018-01-04 00:00:00    2018-01-05 00:00:00

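2 answers:

Answer 0: (score: 1)

What actually happens is that both Series, df['Date_start'] and df['Date_end'], already have dtype datetime64[ns]. When the DataFrame is displayed, however, pandas omits the time part of a column whose time values are all zero. If you need formatted output, you can convert the columns back to object dtype and use dt.strftime to give them a format: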

df['Date_start'] = pd.to_datetime(df['Date_start']).dt.strftime('%Y/%m/%d %H:%M:%S')
df['Date_end'] = pd.to_datetime(df['Date_end']).dt.strftime('%Y/%m/%d %H:%M:%S')
print(df)

Output:

   Id_sensor           Date_start             Date_end
0          1  2018/01/04 00:00:00  2018/01/05 00:00:00
1          2  2018/01/04 00:00:10  2018/01/06 00:00:00
2          3  2018/01/04 00:14:00  2017/01/06 00:00:00
3          4  2018/01/04 00:00:00  2018/01/05 00:00:00
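
The behaviour described in this answer can be checked with a minimal sketch. It assumes only the Date_end values from the question; the variable names (end, formatted, and so on) are illustrative and not part of the original answer. It shows that the parsed column really is a datetime column whose times are all midnight, and that dt.strftime returns strings rather than datetimes:

```python
import pandas as pd

# Sample values taken from the question's Date_end column.
end = pd.to_datetime(pd.Series(['2018-01-05', '2018-01-06',
                                '2017-01-06', '2018-01-05']))

# The column is a real datetime column even though printing it omits 00:00:00.
is_datetime = pd.api.types.is_datetime64_any_dtype(end)

# Every value equals its own midnight, which is why the time part is hidden.
all_midnight = bool((end == end.dt.normalize()).all())

# strftime returns strings, so the result is no longer a datetime column.
formatted = end.dt.strftime('%Y-%m-%d %H:%M:%S')
still_datetime = pd.api.types.is_datetime64_any_dtype(formatted)

print(is_datetime, all_midnight, still_datetime)
print(formatted.tolist())
```

The trade-off is that the formatted column looks uniform but no longer supports datetime operations such as subtraction or the .dt accessor.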

Answer 1: (score: 1)

You can first use to_datetime to convert the columns to a datetime dtype, and then use dt.strftime to convert them to strings in the desired format:

import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'Id_sensor': [1, 2, 3, 4], 
    'Date_start': ['2018-01-04 00:00:00.0', '2018-01-04 00:00:10.0',
                   '2018-01-04 00:14:00.0', '2018-01-04'],
    'Date_end': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})

df['Date_start'] = pd.to_datetime(df['Date_start']).dt.strftime('%Y-%m-%d %H:%M:%S')
df['Date_end'] = pd.to_datetime(df['Date_end']).dt.strftime('%Y-%m-%d %H:%M:%S')

print(df)
# Output:
#
#    Id_sensor           Date_start             Date_end
# 0          1  2018-01-04 00:00:00  2018-01-05 00:00:00
# 1          2  2018-01-04 00:00:10  2018-01-06 00:00:00
# 2          3  2018-01-04 00:14:00  2017-01-06 00:00:00
# 3          4  2018-01-04 00:00:00  2018-01-05 00:00:00
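
If the columns should stay datetime for later calculations, one possible variation (a sketch, not something either answer proposes) is to keep the parsed columns in df and build a formatted copy only for display. The names display_df, duration, and fmt are hypothetical. Each string is parsed on its own with pd.Timestamp here, on the assumption that some pandas versions may infer a single format from the first value and reject the date-only entry in Date_start:

```python
import pandas as pd

df = pd.DataFrame({
    'Id_sensor': [1, 2, 3, 4],
    'Date_start': ['2018-01-04 00:00:00.0', '2018-01-04 00:00:10.0',
                   '2018-01-04 00:14:00.0', '2018-01-04'],
    'Date_end': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})

# Parse each string individually so the mixed layouts in Date_start
# (with and without a time part) do not depend on format inference.
for col in ['Date_start', 'Date_end']:
    df[col] = pd.to_datetime([pd.Timestamp(s) for s in df[col]])

# The real datetime columns still support arithmetic.
duration = df['Date_end'] - df['Date_start']

# Build a string copy only for display; df itself keeps its datetime columns.
fmt = '%Y-%m-%d %H:%M:%S'
display_df = df.assign(Date_start=df['Date_start'].dt.strftime(fmt),
                       Date_end=df['Date_end'].dt.strftime(fmt))

print(display_df)
print(duration)
```

This keeps the formatting concern separate from the data: df can still be filtered or subtracted by date, while display_df shows the zero-padded times the question asks for.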