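I am trying to do something that seems very simple with Pandas, but I am stuck. I want to merge multiple spreadsheets (each with multiple worksheets) into a single MasterSpreadSheet that contains all of the worksheets.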
input example:
spreadsheet1 -> sheetname_a, sheetname_b, sheetname_c, sheetname_d
spreadsheet2 -> sheetname_a, sheetname_b, sheetname_c, sheetname_d
spreadsheet3 ......
output desired:
one single file with the data from all spreadsheets, separated by the specific sheetname
MasterSpreadSheet -> sheetname_a, sheetname_b, sheetname_c, sheetname_d
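Below is the code that generates a single MasterSpreadSheet, but it overwrites the data from the earlier spreadsheets, so the master file only contains the data from the last spreadsheet: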
with pd.ExcelWriter(outputfolder + '/' + country + '-MasterSheet.xlsx') as writer:
    for spreadsheet in glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')):
        sheets = pd.ExcelFile(spreadsheet).sheet_names
        for sheet in sheets:
            df = pd.DataFrame()
            sheetname = sheet.split('-')[-1]
            data = pd.read_excel(spreadsheet, sheet)
            data.index = [basename(spreadsheet)] * len(data)
            df = df.append(data)
            df.to_excel(writer, sheet_name = sheetname)
    writer.save()
    writer.close()
Any suggestions?
Thanks!
Answer 0 (score: 0)
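I got it working :). I now loop over the sheets first and, for each sheet, over the spreadsheet files, appending each file's data to a list. A pandas concat at the end of each sheet iteration then combines that list before it is written: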
import glob
import os
from os.path import basename

import pandas as pd

# outputfolder, templatefolder and country are assumed to be defined earlier in the script.
df1 = []
sheet_list = []
sheet_counter = 0
with pd.ExcelWriter(outputfolder + '/' + country + '-MasterSheet.xlsx') as writer:
    for template in glob.glob( os.path.join(templatefolder, '*.textfsm') ):
        template_name = template.split('\\')[-1].split('.textfsm')[0]
        sheet_list.append(template_name) ## List of Sheets per Spreadsheet file
    for sheet in sheet_list:
        for spreadsheet in glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')):
            data = pd.read_excel(spreadsheet, sheet_name=sheet_counter)
            data.index = [basename(spreadsheet)] * len(data)
            df1.append(data)
        df1 = pd.concat(df1) ## One combined frame per sheet, written once
        df1.to_excel(writer, sheet_name=sheet)
        df1 = []
        sheet_counter += 1 ##Adding a counter to get the next Sheet of each Spreadsheet
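
The code above matches sheets by position, so it relies on every spreadsheet keeping its sheets in the same order as the template list. As a possible alternative, here is a minimal sketch that groups by sheet name instead. It assumes the sheet names are identical across files. The function names `combine_by_sheet` and `merge_workbooks` are made up for this illustration and are not part of pandas. It uses `pd.read_excel(path, sheet_name=None)`, which returns a dict of sheet name to DataFrame.

```python
from collections import defaultdict
from pathlib import Path

import pandas as pd


def combine_by_sheet(workbooks):
    """Group frames by sheet name and concatenate each group.

    `workbooks` maps a source label (e.g. a file name) to a dict of
    {sheet_name: DataFrame}, the shape that
    pd.read_excel(path, sheet_name=None) returns.
    """
    grouped = defaultdict(list)
    for source, sheets in workbooks.items():
        for sheet_name, frame in sheets.items():
            labelled = frame.copy()
            # Tag every row with the file it came from, as the code above does.
            labelled.index = [source] * len(labelled)
            grouped[sheet_name].append(labelled)
    # One concatenated frame per sheet name.
    return {name: pd.concat(frames) for name, frames in grouped.items()}


def merge_workbooks(paths, out_path):
    """Hypothetical wrapper: read all sheets of all files, write one master file."""
    workbooks = {Path(p).name: pd.read_excel(p, sheet_name=None) for p in paths}
    with pd.ExcelWriter(out_path) as writer:
        for name, frame in combine_by_sheet(workbooks).items():
            frame.to_excel(writer, sheet_name=name)
```

With the variables from the code above, it could be called as `merge_workbooks(glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')), outputfolder + '/' + country + '-MasterSheet.xlsx')`. Each sheet is written only once, after all files have been read, which avoids the overwriting described in the question.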