I have a loop that reads the Excel sheets in a document. I want to store them all in a list:
DF_list = list()
for sheet in sheets:
    df = pd.read_excel(...)
    DF_list = DF_list.append(df)
If I type:
[df df df df]
it works.
Sorry, I come from a Matlab background and am not used to Python yet, but I like it. Thanks.
Answer 0 (score: 3)
Try this:
DF_list = list()
for sheet in sheets:
    df = pd.read_excel(...)
    DF_list.append(df)
Or, for more compact Python, something like this might do:
DF_list=[pd.read_excel(...) for sheet in sheets]
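Answer 1 (score: 3)
.append() modifies the list in place and returns None. In the first loop iteration you overwrite DF_list with None, so the append in the second iteration fails.
Hence:
DF_list = list()
for sheet in sheets:
    DF_list.append(pd.read_excel(...))
Or use a list comprehension: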
DF_list = [pd.read_excel(...) for sheet in sheets]
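The behaviour described in this answer can be reproduced without pandas. The following is a minimal sketch in which plain strings stand in for the dataframes; the names items, broken and error_message are only illustrative.

```python
# Plain strings stand in for DataFrames so the snippet runs without pandas.
items = []
result = items.append("first")   # append mutates `items` in place
print(result)                    # None
print(items)                     # ['first']

# Reassigning the name to the return value of append loses the list.
broken = []
broken = broken.append("first")  # `broken` is now None
try:
    broken.append("second")      # same failure as the second loop iteration
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

The AttributeError raised here is what the loop in the question hits on its second pass.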
Answer 2 (score: 2)
The complete solution is as follows:
import pandas as pd

# (0) Path of the Excel file that contains many sheets
#     (raw string, so the backslashes are not read as escape sequences)
excelfile_wt_many_sheets = r'C:\this\is\my\location\and\filename.xlsx'
# (1) Initialize an empty list to hold one dataframe per sheet
df_list = []
# (2) Get the list of all sheet names in the Excel file
xls = pd.ExcelFile(excelfile_wt_many_sheets)
sheets = xls.sheet_names
# (3) Iterate over (2), read every sheet of the Excel file into a dataframe
#     and append that dataframe to (1)
for sheet in sheets:
    df = pd.read_excel(excelfile_wt_many_sheets, sheet)
    df_list.append(df)
Answer 3 (score: 1)
If you use the parameter sheet_name=None:
dfs = pd.read_excel(..., sheet_name=None)
it returns a dictionary of DataFrames:
sheet_name : string, int, mixed list of strings/ints, or None, default 0
Strings are used for sheet names, Integers are used in zero-indexed
sheet positions.
Lists of strings/integers are used to request multiple sheets.
Specify None to get all sheets.
str|int -> DataFrame is returned.
list|None -> Dict of DataFrames is returned, with keys representing
sheets.
Available Cases
* Defaults to 0 -> 1st sheet as a DataFrame
* 1 -> 2nd sheet as a DataFrame
* "Sheet1" -> 1st sheet as a DataFrame
* [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
* None -> All sheets as a dictionary of DataFrames
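Since the question asks for a list rather than a dictionary, the values of that dictionary can be collected afterwards. This is a minimal sketch, assuming a pandas version recent enough to accept sheet_name; the function name load_sheets_as_list and the file name in the usage comment are hypothetical.

```python
import pandas as pd


def load_sheets_as_list(path):
    """Read every sheet of the workbook at `path` into a list of DataFrames."""
    # sheet_name=None asks pandas for all sheets, returned as {sheet name: DataFrame}
    frames_by_sheet = pd.read_excel(path, sheet_name=None)
    # Keep only the DataFrames; dict insertion order is preserved on Python 3.7+
    return list(frames_by_sheet.values())


# Hypothetical usage (the file name is a placeholder):
# DF_list = load_sheets_as_list("workbook.xlsx")
```

Keeping the dictionary is often more convenient, because each dataframe stays associated with its sheet name.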
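Answer 4 (score: 0)
Actually, there is no need to define a new list to store a bunch of dataframes. Applying pandas.ExcelFile to an Excel file with multiple sheets returns an ExcelFile object, which keeps all the sheets together; each sheet can then be parsed into a dataframe. Hope the code below helps.
import pandas as pd

# raw string, so that \r is not read as a carriage return
xls = pd.ExcelFile(r'C:\read_excel_file_with_multiple_sheets.xlsx')
Sheet_names_list = xls.sheet_names
for sheet in Sheet_names_list:
    df_to_print = xls.parse(sheet_name=sheet)
    print(df_to_print)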
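If the parsed sheets are needed later rather than only printed, they can be gathered while the file is open. The following is a rough sketch of that idea, not part of the original answer; the function name parse_all_sheets is hypothetical.

```python
import pandas as pd


def parse_all_sheets(path):
    """Return {sheet name: DataFrame} for every sheet in the workbook at `path`."""
    # The context manager closes the file handle once all sheets are parsed
    with pd.ExcelFile(path) as xls:
        return {name: xls.parse(name) for name in xls.sheet_names}
```

Opening the workbook once and parsing each sheet from the same ExcelFile object avoids reopening the file for every sheet.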