Question

I have a DataFrame, "Allfields", that contains all the columns (headers).

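I have to read an Excel workbook that has 25 sheets, but I only need 10 of them. So I pass a list of sheet names into a nested for loop to read the sheets. The sheet names look like "Labour M1", "Travel M1", "Equip M1" ... "Labour M2", "Travel M2", etc.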

I have a main loop that defines the number at the end of the sheet name.

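So I read all the sheets whose names end in 1 and save them in a list, then concatenate them into Dataframe1, then add a column called "Milestone" that holds the value of "i" from the main for loop. Then the loop runs again, reads the sheets whose names end in 2, and repeats the same process.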
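The per-milestone step described above can be sketched as follows. This is a minimal illustration, not the code from my script: `tag_milestone` and the two small frames are hypothetical stand-ins for the real sheets, which are assumed to share the same columns.

```python
import pandas as pd


def tag_milestone(frames, milestone):
    """Stack one milestone's sheets and label every row with its number."""
    combined = pd.concat(frames, ignore_index=True)
    # assign() returns a new frame, so the label is fixed at this point
    # and cannot be changed by later loop iterations.
    return combined.assign(Milestone=milestone)


# Hypothetical stand-ins for the "Labour M1" and "Travel M1" sheets.
labour_m1 = pd.DataFrame({"Consortium Member": ["A", "B"], "Cost": [10, 20]})
travel_m1 = pd.DataFrame({"Consortium Member": ["A"], "Cost": [5]})

m1 = tag_milestone([labour_m1, travel_m1], 1)
print(m1)
```

Every row of `m1` carries the milestone number that was current when the sheets were combined.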

Finally, I append "Dataframe1" to the "Allfields" DataFrame, and once the loop has finished I save the "Allfields" DataFrame to an Excel file.

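When I open the Excel file, the data has been appended correctly, but the problem is that the column I added always takes the last value of "i" from the for loop, when it should take the number of the sheets being read.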

Here is my code:

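import pandas as pd
from pandas import ExcelWriter
from xlrd import open_workbook

# getpath() and Allfields are defined elsewhere in the script.
for i in range(1, 3):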
    list_sheets = ['Labour M' + str(i), 'Travel M' + str(i), 'Equip M' + str(i), 'Consult-SubC M' + str(i),
                   'Other M' + str(i)]
    Empty_List = []

    All_Milestone = pd.DataFrame()
    for p in list_sheets:

        print(p)
        book_open = open_workbook(getpath())

        sheet_open = book_open.sheet_by_name(p)

        print("Successfully found the sheet")

        for rowidx in range(sheet_open.nrows):
            row = sheet_open.row(rowidx)
            for colidx, cell in enumerate(row):
                if cell.value == "Consortium Member":
                    print("Sheet Name:", sheet_open.name)
                    print("Row Number:", rowidx)
                    value = int(rowidx)
        else:
            print("Sheet ", sheet_open, "not found.")

        reading_book = pd.read_excel(getpath(), sheet_name=p, skiprows=value)
        sheet = reading_book.dropna(axis=0, how='any', subset=['Consortium Member'])
        sheet.drop([col for col in sheet.columns if "Unnamed" in col], axis=1, inplace=True)

        print("Successfully read the sheet" + p + "\n")

        Empty_List.append(sheet)

        print("Successfully appended the sheet" + p + "\n")

        Dataframe1 = pd.DataFrame()
        Dataframe1 = pd.concat(Empty_List)
        Dataframe1['Milestone'] = i
        Dataframe1.reset_index(inplace=False)

        write = ExcelWriter("inter"+str(i)+".xlsx")
        Dataframe1.to_excel(write, 'Sheet1', index=False)
        write.save()


        Dataframe_Temp = Allfields.append(Dataframe1)
        All_fields_Total_Columns = len(Allfields.columns)
        Dataframe_Total_columns = len(Dataframe_Temp.columns)

        if Dataframe_Total_columns == All_fields_Total_Columns:
            Milestone_Dataframe = Allfields.append(Dataframe1)

        else:
            print("Total number of columns are not same")
    All_Milestone = All_Milestone.append(Milestone_Dataframe)

write = ExcelWriter('Final.xlsx')
All_Milestone.to_excel(write, 'Sheet1', index=False)
write.save()

So could you please tell me how to store the correct value in the Milestone column?
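One possible reading of the code above: `All_Milestone = pd.DataFrame()` sits inside the outer `for i` loop, so it appears to be reset on every pass, and only the rows from the last milestone would reach `Final.xlsx`. The sketch below shows the pattern I am trying to get, under stated assumptions: `combine_milestones` and the sample data are hypothetical, and `sheets` is a dict of sheet name to DataFrame, such as the one `pd.read_excel(path, sheet_name=None)` returns. It uses `pd.concat` rather than `DataFrame.append`, which was removed in pandas 2.0.

```python
import pandas as pd


def combine_milestones(sheets, categories, milestones):
    """Build one frame from sheets named '<category> M<number>'.

    `sheets` maps sheet name -> DataFrame.
    """
    tagged = []  # created once, before the loop, so nothing is discarded
    for i in milestones:
        names = [f"{category} M{i}" for category in categories]
        frames = [sheets[name] for name in names if name in sheets]
        if not frames:
            continue  # no sheets for this milestone number
        # Label the rows while `i` still refers to this milestone.
        tagged.append(pd.concat(frames, ignore_index=True).assign(Milestone=i))
    # pd.concat raises on an empty list, so fall back to an empty frame.
    return pd.concat(tagged, ignore_index=True) if tagged else pd.DataFrame()


# Hypothetical in-memory data standing in for the workbook.
sheets = {
    "Labour M1": pd.DataFrame({"Consortium Member": ["A"], "Cost": [10]}),
    "Travel M1": pd.DataFrame({"Consortium Member": ["B"], "Cost": [5]}),
    "Labour M2": pd.DataFrame({"Consortium Member": ["A"], "Cost": [7]}),
    "Travel M2": pd.DataFrame({"Consortium Member": ["B"], "Cost": [3]}),
}
result = combine_milestones(sheets, ["Labour", "Travel"], range(1, 3))
print(result)
```

Collecting the tagged frames in a list and concatenating once at the end keeps each milestone's label with its own rows and avoids copying the growing frame on every iteration.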

Python: Reading multiple Excel sheets, adding a column, and appending them to one DataFrame

0 answers: