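Openpyxl supports converting an entire worksheet of an Excel 2010 workbook to a pandas dataframe. I want to select a subset of those cells using Excel's native indexing, then convert that block of cells to a dataframe. Openpyxl's documentation on working with pandas doesn't help: https://openpyxl.readthedocs.io/en/stable/pandas.html
I'm trying to avoid 1) iterating over all the rows and columns in the data, since that is inefficient; 2) deleting the unwanted cells from the dataframe after it has been created; and 3) pandas' read_excel function, because it doesn't seem to support specifying a range in Excel's native indexing.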
#This converts an entire worksheet to a pandas dataframe
import pandas as pd
import openpyxl as px
Work_Book = px.load_workbook(filename='MyBook.xlsx')
Work_Sheet = Work_Book['Sheet1']
df = pd.DataFrame(Work_Sheet.values)
#This produces a tuple of cells. Calling pd.DataFrame on it returns
#"ValueError: DataFrame constructor not properly called!"
Cell_Range = Work_Sheet['B2:D4']
#This is the only way I currently know to convert Cell_Range to a Pandas
# DataFrame. I'm trying to avoid these nested loops.
row_list = []
for row in Cell_Range:
    col_list = []
    for col in row:
        col_list.append(col.value)
    row_list.append(col_list)
df = pd.DataFrame(row_list)
I'm trying to find the most efficient way to convert the Cell_Range object above to a pandas dataframe. Thanks!
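
For reference, here is a minimal sketch of one possible approach, not necessarily the most efficient one. It assumes openpyxl 2.6 or later, where iter_rows accepts values_only=True, and it uses range_boundaries from openpyxl.utils to turn an A1-style string into row and column bounds. The helper name range_to_df and the in-memory 5x5 test sheet are made up for illustration; with a real file, the sheet would come from load_workbook as in the code above.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils import range_boundaries

# Build a small in-memory sheet so the sketch runs without MyBook.xlsx.
# Cell at row r, column c holds r * 10 + c (so B2 is 22 and D4 is 44).
wb = Workbook()
ws = wb.active
for r in range(1, 6):
    ws.append([r * 10 + c for c in range(1, 6)])


def range_to_df(sheet, cell_range):
    """Hypothetical helper: return the values in an A1-style range as a DataFrame."""
    # range_boundaries returns (min_col, min_row, max_col, max_row), 1-based.
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    rows = sheet.iter_rows(
        min_row=min_row, max_row=max_row,
        min_col=min_col, max_col=max_col,
        values_only=True,  # yield plain values instead of Cell objects
    )
    return pd.DataFrame(list(rows))


df = range_to_df(ws, 'B2:D4')
print(df)
```

This still walks the cells inside the requested block, but the iteration happens inside openpyxl and only covers the selected range. A shorter alternative that keeps the existing Cell_Range object would be pd.DataFrame([[c.value for c in row] for row in Cell_Range]), which replaces the nested loops with a comprehension.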