Question

I am reading an Excel file and merging it into a CSV file.

When I read the Excel file, I have a date field:

0    2018-05-28 00:00:00
1    9999-12-31 00:00:00
2    2018-02-26 00:00:00
3    2018-02-26 00:00:00
4    2018-02-26 00:00:00
Name: Date_started, dtype: object

I check the data type:

df['Date_started'].dtype
dtype('O')

Then, when I write the resulting dataframe to CSV, I get:

df.to_csv(folderpath + "Date_Started_df.csv", encoding="UTF-8", index=False, na_rep='', date_format='%d%m%Y')
Date_Started

28/05/2018 00:00
31/12/9999 00:00
26/02/2018 00:00
26/02/2018 00:00
26/02/2018 00:00

I tried:

# Parentheses let the expression continue across lines
df.loc[:, 'Date_Started'] = (
    df['Date_Started'].astype('str').str[8:10] + "/"
    + df['Date_Started'].astype('str').str[5:7] + "/"
    + df['Date_Started'].astype('str').str[:4]
)

Which gave me:

0    28/05/2018
1    31/12/9999
2    26/02/2018
3    26/02/2018
4    26/02/2018
Name: Date_started, dtype: object

I thought the problem might be in how I write it out:

df.to_csv(filename, date_format='%Y%m%d')

But I still get the time!?

Answer 1

Before writing to CSV, you need to convert the series to datetime:

df['Date_Started'] = pd.to_datetime(df['Date_Started'])
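A minimal sketch of what that conversion changes, assuming made-up sample data rather than the asker's real file: the column goes from object dtype (plain Python datetime objects) to a pandas datetime64 dtype. Only in-range dates are used here.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data: datetime objects stored in an object-dtype column.
df = pd.DataFrame({
    "Date_Started": pd.Series(
        [datetime(2018, 5, 28), datetime(2018, 2, 26)], dtype=object
    )
})
print(df["Date_Started"].dtype)  # object dtype before the conversion

# Convert the column to a real datetime64 column.
df["Date_Started"] = pd.to_datetime(df["Date_Started"])
print(df["Date_Started"].dtype)  # a datetime64 dtype after the conversion
```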

Once the column is datetime, pandas will apply date_format='%d%m%Y' to it in to_csv (use '%d/%m/%Y' if you want the slashes). The to_csv docs state this explicitly:

date_format : string, default None

Format string for datetime objects
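A minimal end-to-end sketch, again on made-up sample data. The first part shows date_format being applied to a datetime64 column. The second part is an assumption on my side rather than something from the question: a sentinel such as 9999-12-31 lies beyond pd.Timestamp.max (year 2262), so depending on the pandas version pd.to_datetime may raise an out-of-bounds error for it. In that case, formatting each value with its own strftime is one possible fallback.

```python
from datetime import datetime

import pandas as pd

# Hypothetical in-range data held in a datetime64 column.
in_range = pd.DataFrame(
    {"Date_Started": pd.to_datetime(["2018-05-28", "2018-02-26"])}
)
# to_csv applies date_format to datetime64 columns, so no time part is written.
csv_text = in_range.to_csv(index=False, date_format="%d/%m/%Y")
print(csv_text)

# Hypothetical object column with the 9999-12-31 sentinel from the question.
with_sentinel = pd.DataFrame({
    "Date_Started": pd.Series(
        [datetime(2018, 5, 28), datetime(9999, 12, 31)], dtype=object
    )
})
# Format each datetime object directly; no conversion to datetime64 is needed.
with_sentinel["Date_Started"] = with_sentinel["Date_Started"].map(
    lambda d: d.strftime("%d/%m/%Y")
)
fallback_csv = with_sentinel.to_csv(index=False)
print(fallback_csv)
```

The fallback leaves strings in the column, so date_format is no longer needed when writing it out.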

Writing out dates without the time from an Excel file using Python
