I am trying to use the pandas library to process a number of Excel files that all have the same columns.
Here is my function:
def read_dipsfile(writer):
    atg_path = '/Users/ratha/PycharmProjects/DataLoader/data/dips'
    files = os.listdir(atg_path)
    df = pd.DataFrame()
    dateCol = ['Dip Time']
    for f in files:
        if(f.endswith('.CSV')):
            data = pd.read_csv(os.path.join(atg_path, f), delimiter=',', skiprows=[1], skipinitialspace=True,
                               parse_dates=dateCol)
            data['Dip Time'] = pd.to_datetime(data['Dip Time'])
            if mid_day_check(data['Dip Time']):
                data['Dip Time'] = data['Dip Time'].dt.normalize()
            data['Dip Time'] = data['Dip Time'].dt.strftime('%d/%m/%Y')
            df = df.append(data)
    x = df.groupby(['Dip Time','Site', 'Tank ID','Product','Volume'], as_index=False).apply(atg_aggregation)
    x.to_excel(writer, sheet_name='DipsSummary')
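Here I expect my 'Dip Time' column to show the date as '01-09-2019' (day-month-year), but I end up with 09/01/2019 (month-day-year format). What am I doing wrong here?
The dates in my raw files are in a format like '17/09/2019'. How should I specify the date format when reading the date column?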
Answer 0 (score: 0)
Change the to_datetime line to add dayfirst:
data['Dip Time'] = pd.to_datetime(data['Dip Time'],dayfirst=True)
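To illustrate why this helps, here is a minimal sketch of how the same string parses with and without dayfirst. The sample dates and the in-memory CSV are made up for the demonstration and do not come from the asker's files. Since every date in the source files appears to follow the '17/09/2019' pattern, passing an explicit format is probably the stricter option, and read_csv also accepts dayfirst when the column is parsed at load time.

```python
import io

import pandas as pd

# An ambiguous date: both fields are <= 12, so pandas has to guess the order.
month_first = pd.to_datetime("01/09/2019")               # default: month first
day_first = pd.to_datetime("01/09/2019", dayfirst=True)  # day first

print(month_first.date())  # 2019-01-09 (January 9)
print(day_first.date())    # 2019-09-01 (September 1)

# Illustrative sample values, assumed to match the 'dd/mm/yyyy' source format.
raw = pd.Series(["01/09/2019", "17/09/2019"])

# Option 1: hint that the day comes first.
parsed = pd.to_datetime(raw, dayfirst=True)

# Option 2: state the format explicitly; unexpected values raise an error
# instead of being silently misread.
explicit = pd.to_datetime(raw, format="%d/%m/%Y")

# Option 3: parse while reading, so no separate to_datetime call is needed.
csv_text = "Dip Time,Volume\n01/09/2019,10\n17/09/2019,12\n"
from_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["Dip Time"], dayfirst=True)

# Format for display with dashes, as the question expects.
print(parsed.dt.strftime("%d-%m-%Y").tolist())
```

Note that the question asks for '01-09-2019' with dashes, while the original code formats with '%d/%m/%Y'; using '%d-%m-%Y' in strftime gives the dashed form.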