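Each folder contains one CSV for every month of the year (1.csv, 2.csv, 3.csv, etc.). The script builds a DataFrame that combines the 9th column of all 12 CSVs and writes it to a sheet in an xlsx file named concentrated.xlsx. It works, but only for one directory at a time: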
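from glob import glob
import pandas as pd
from natsort import natsorted

files = glob('2014/*.csv')
sorted_files = natsorted(files)  # natural order: 1.csv, 2.csv, ..., 12.csv

def read_9th(fn):
    # headers is not defined in this snippet; it is assumed to be the list of column names
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer, sheet_name='2014')
writer.close()  # writer.save() on older pandas; save() was removed in pandas 2.0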
Is it possible to create the DataFrame for each directory automatically, instead of building one by hand for every folder like this:
files14 = glob('2014/*.csv')
files15 = glob('2015/*.csv')
sorted_files14 = natsorted(files14)
sorted_files15 = natsorted(files15)

def read_9th(fn):
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files14], axis=1)
big_df1 = pd.concat([read_9th(fn) for fn in sorted_files15], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer, sheet_name='2014')
big_df1.to_excel(writer, sheet_name='2015')
writer.close()  # writer.save() on older pandas
Answer 0 (score: 1)
If you have a list of the folders you want to process, for example
folders = os.listdir('.')
# or
folders = ['2014', '2015', '2016']
you can do the following:
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
for folder in folders:
    files = glob('%s/*.csv' % folder)
    sorted_files = natsorted(files)
    big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
    big_df.to_excel(writer, sheet_name=folder)  # one sheet per folder
writer.close()  # call once, after the loop; writer.save() on older pandas
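One caveat with os.listdir('.') is that it also returns plain files and folders without any CSVs, and each of those would end up as a sheet. As a rough sketch (not part of the original answer), the folder list could be built with the standard library alone. The helper names csvs_by_folder and natural_key are hypothetical, and natural_key is only a stand-in for natsorted, assuming the files are named 1.csv through 12.csv:

```python
import os
import re
from glob import glob


def natural_key(path):
    """Sort key that compares digit runs as integers, so 2.csv comes before 10.csv."""
    name = os.path.basename(path)
    # re.split with a capturing group alternates text and digit chunks
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]


def csvs_by_folder(root='.'):
    """Map each subdirectory of root that holds CSV files to its sorted CSV paths."""
    result = {}
    for entry in sorted(os.listdir(root)):
        folder = os.path.join(root, entry)
        if not os.path.isdir(folder):
            continue  # os.listdir also returns plain files; skip them
        files = sorted(glob(os.path.join(folder, '*.csv')), key=natural_key)
        if files:  # ignore folders that contain no CSVs
            result[entry] = files
    return result
```

The keys of the returned dictionary could then serve as both the folder list and the sheet names, with the values passed to read_9th in place of the glob and natsorted calls.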