Python Pandas - Merge multiple spreadsheets with multiple sheets into a single MasterSpreadsheet containing all sheets

Asked: 2017-10-25 08:12:39

Tags: python excel pandas

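I am trying to do something with Pandas that seems very simple, but I am stuck. I want to merge multiple spreadsheets (each with multiple sheets) into one MasterSpreadSheet that contains all the sheets.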

input example:
spreadsheet1 -> sheetname_a, sheetname_b, sheetname_c, sheetname_d
spreadsheet2 -> sheetname_a, sheetname_b, sheetname_c, sheetname_d
spreadsheet3 ......

output desired:
one single file with the data from all spreadsheets, separated by the specific sheetname
MasterSpreadSheet -> sheetname_a, sheetname_b, sheetname_c, sheetname_d

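The code below generates a single MasterSpreadSheet, but each spreadsheet overwrites the data written by the previous one, so the master file ends up containing only the data from the last spreadsheet: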

with pd.ExcelWriter(outputfolder + '/' + country + '-MasterSheet.xlsx') as writer:

    for spreadsheet in glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')):
        sheets = pd.ExcelFile(spreadsheet).sheet_names
        for sheet in sheets:
            df = pd.DataFrame()
            sheetname = sheet.split('-')[-1]
            data = pd.read_excel(spreadsheet, sheet)
            data.index = [basename(spreadsheet)] * len(data)
            df = df.append(data)
            df.to_excel(writer, sheet_name=sheetname)

        writer.save()
        writer.close()

Any suggestions?

Thanks!

1 answer:

Answer 0: (score: 0)

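It works now :). I loop sheet by sheet, and for each sheet over the spreadsheet files, appending the data from every file, then run a pandas concat at the end of each sheet loop: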

import glob
import os
from os.path import basename

import pandas as pd

# outputfolder, templatefolder and country are defined earlier in the script.
sheet_list = []
for template in glob.glob(os.path.join(templatefolder, '*.textfsm')):
    # One sheet per template, named after the template file (extension removed)
    template_name = os.path.splitext(basename(template))[0]
    sheet_list.append(template_name)  ## List of Sheets per Spreadsheet file

with pd.ExcelWriter(os.path.join(outputfolder, country + '-MasterSheet.xlsx')) as writer:
    # sheet_counter is the position of the sheet inside each Spreadsheet
    for sheet_counter, sheet in enumerate(sheet_list):
        frames = []
        for spreadsheet in glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')):
            data = pd.read_excel(spreadsheet, sheet_name=sheet_counter)
            data.index = [basename(spreadsheet)] * len(data)
            frames.append(data)
        # Write each master sheet once, after all files have been collected
        pd.concat(frames).to_excel(writer, sheet_name=sheet)
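
For anyone who does not have the template folder to derive the sheet list from, here is a minimal sketch of the same idea that groups by sheet name instead of by position. It assumes the sheet names are identical across the input files and that an Excel engine such as openpyxl is installed. The names combine_by_sheet and merge_workbooks are hypothetical helpers, not part of pandas. It relies on pd.read_excel(path, sheet_name=None), which returns a dict of {sheet_name: DataFrame} for the whole workbook.

```python
from pathlib import Path

import pandas as pd


def combine_by_sheet(workbooks):
    """Group frames by sheet name and concatenate each group.

    workbooks maps a source label (for example a file name) to a dict of
    {sheet_name: DataFrame}, the shape returned by
    pd.read_excel(path, sheet_name=None).
    """
    grouped = {}
    for source, sheets in workbooks.items():
        for sheet_name, frame in sheets.items():
            labelled = frame.copy()
            # Label every row with the file it came from, as in the answer above
            labelled.index = [source] * len(labelled)
            grouped.setdefault(sheet_name, []).append(labelled)
    # One concatenated frame per sheet name
    return {name: pd.concat(frames) for name, frames in grouped.items()}


def merge_workbooks(paths, out_path):
    """Hypothetical helper: read every workbook and write one master file."""
    workbooks = {Path(p).name: pd.read_excel(p, sheet_name=None) for p in paths}
    with pd.ExcelWriter(out_path) as writer:
        for sheet_name, frame in combine_by_sheet(workbooks).items():
            # Each sheet is written exactly once, so nothing gets overwritten
            frame.to_excel(writer, sheet_name=sheet_name)
```

It could be called as merge_workbooks(glob.glob(os.path.join(outputfolder, '*-Spreadsheet.xlsx')), master_path), where master_path is whatever output file you choose. The key point is the same as in the answer above: collect all frames for a sheet first and write that sheet only once.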