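I have a CSV file with a column that contains dates in several different formats. I need to parse them and get the results in a single, consistent format.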
Wednesday 12 August 2015
Wednesday 12 August 2015
Friday April 1 2016
Friday April 1 2016
5/12/2016
5/12/2016
That is the file; I want the dates in mm/dd/yy format. My code is as follows:
import pandas as pd

df = pd.read_csv('test3.csv')

newDate = []  # collects the converted value for each row
for item in df['serverDatePrettyFirstAction']:
    if '/' in item:
        # already numeric, e.g. 5/12/2016: keep it unchanged
        newDate.append(item)
    else:
        # drop the leading weekday name, e.g. "Friday April 1 2016" -> "April 1 2016"
        item = item.split(' ', 1)[1]
        newDate.append(item)

df['newDate'] = newDate
df.to_csv('D:/Python/10.36.202.64/newfile.csv', index=False)
This is what I get:
serverDatePrettyFirstAction    newDate
Wednesday 12 August 2015       12-Aug-15
Wednesday 12 August 2015       12-Aug-15
Friday April 1 2016            April 1 2016
Friday April 1 2016            April 1 2016
5/12/2016                      5/12/2016
5/12/2016                      5/12/2016
Also, is there a way to overwrite the values in the same column?
Answer 0 (score: 1)
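As long as your data is not too large, you can use the third-party dateutil library. (It is slower on big files because it has to guess the format for every value.)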
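import pandas as pd
from dateutil import parser

df = pd.read_csv('test3.csv')
# parser.parse guesses the format of each value independently
df['newDate'] = df['serverDatePrettyFirstAction'].apply(parser.parse)
# date_format controls how datetime columns are written; use '%m/%d/%y' for mm/dd/yy
df.to_csv('newfile.csv', index=False, date_format='%Y-%m-%d')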
To overwrite the values in the same column, use:
df['serverDatePrettyFirstAction'] = df['serverDatePrettyFirstAction'].apply(parser.parse)
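If the guessing is a concern, one alternative (not what this answer uses) is to try an explicit list of formats with the standard library only. This is a minimal sketch: `KNOWN_FORMATS` and `to_mmddyy` are hypothetical names, the format list is an assumption based on the three shapes in the sample data, and 5/12/2016 is assumed to mean month/day.

```python
from datetime import datetime

# Assumption: these three patterns cover every shape in the sample column.
KNOWN_FORMATS = ("%A %d %B %Y", "%A %B %d %Y", "%m/%d/%Y")

def to_mmddyy(text):
    """Return text as mm/dd/yy, trying each known format in turn."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%m/%d/%y")
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognised date: {text!r}")

samples = ["Wednesday 12 August 2015", "Friday April 1 2016", "5/12/2016"]
converted = [to_mmddyy(s) for s in samples]
print(converted)
```

The same function could be passed to `df['serverDatePrettyFirstAction'].apply(...)` to overwrite the column with mm/dd/yy strings. Unlike dateutil, it fails loudly on a value whose format is not in the list.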
Answer 1 (score: 1)
A faster way is to use pandas' to_datetime() method:
In [2]: df
Out[2]:
                       Date
0  Wednesday 12 August 2015
1  Wednesday 12 August 2015
2       Friday April 1 2016
3       Friday April 1 2016
4                 5/12/2016
5                 5/12/2016

In [6]: df['newDate'] = pd.to_datetime(df['Date'])
Result:
In [7]: df
Out[7]:
                       Date    newDate
0  Wednesday 12 August 2015 2015-08-12
1  Wednesday 12 August 2015 2015-08-12
2       Friday April 1 2016 2016-04-01
3       Friday April 1 2016 2016-04-01
4                 5/12/2016 2016-05-12
5                 5/12/2016 2016-05-12
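The result above is a datetime column, not yet the mm/dd/yy text the question asks for. A rough sketch of the remaining step, using `.dt.strftime` and overwriting the column in place, follows. The inline CSV stands in for test3.csv and is an assumption. Each value is parsed on its own here because pandas 2.0 and later may reject a mixed-format column passed directly to to_datetime unless `format='mixed'` is given; element-wise parsing is slower but should behave the same across versions.

```python
import io
import pandas as pd

# Assumption: a small inline sample replaces the real CSV file.
csv_text = """Date
Wednesday 12 August 2015
Friday April 1 2016
5/12/2016
"""
df = pd.read_csv(io.StringIO(csv_text))

# Parse each value separately so every row may use a different format,
# then make sure the result is a datetime column.
parsed = pd.to_datetime(df['Date'].apply(pd.to_datetime))

# Overwrite the original column with mm/dd/yy strings.
df['Date'] = parsed.dt.strftime('%m/%d/%y')
print(df)
```

Writing the frame with `df.to_csv('newfile.csv', index=False)` afterwards keeps the strings exactly as formatted, since the column no longer holds datetimes.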