Importing Excel tables into pandas DataFrames

Time: 2019-01-10 10:39:33

Tags: python excel pandas dataframe

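I would like to import the Excel tables in a workbook (created with the table feature in Excel 2007 and later) into separate DataFrames. Apologies if this has been asked before, but my searches did not turn up what I was looking for. I know this is easy to do with the read_excel function, but that either requires specifying a sheet name or returns a dict of DataFrames, one per sheet.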
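
For reference, here is a minimal sketch of the sheet-level behaviour described above: passing `sheet_name=None` to `read_excel` returns a dict keyed by sheet name. The helper name `read_all_sheets` and the file path in the usage comment are hypothetical, chosen only for illustration.

```python
import pandas as pd

def read_all_sheets(path):
    """Return {sheet_name: DataFrame} for every worksheet in the workbook."""
    # sheet_name=None asks pandas to load all sheets rather than just the first
    return pd.read_excel(path, sheet_name=None)

# Usage (hypothetical path):
# sheets = read_all_sheets('excel_file_path.xlsx')
# for name, frame in sheets.items():
#     print(name, frame.shape)
```

Note that the keys are worksheet names, not Excel table names, which is exactly the limitation in question.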

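I was wondering whether there is a way to specify a table name instead or, better yet, to return a dict of DataFrames, one for each table in the workbook.

I know this can be done by combining xlwings with pandas, but I was wondering whether it is already built into one of the pandas functions (maybe ExcelFile).

Something like this: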

import pandas as pd
xls = pd.ExcelFile('excel_file_path.xls')
# read all tables into a dict (table_names is hypothetical, the attribute I wish existed)
tables_to_df_map = {}
for table_name in xls.table_names:
    tables_to_df_map[table_name] = xls.parse(table_name)

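1 Answer:

Answer 0: (score: 0)

Although it is not exactly what I was after, I found a way to get the table names, with the caveat that the lookup is limited to the tables on a single named worksheet.

Here is an excerpt from the code I am currently using: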

import pandas as pd
import openpyxl as op
wb=op.load_workbook(file_location) 
ws=wb['Sheet_Name']
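#Initialising a list that will contain the sheet range of the tables in excel
table_ranges=[]
#Importing table details from excel: Table_Name and Sheet_Range
#(recent openpyxl releases expose the tables through the dict-like ws.tables;
#older releases kept them in the private list ws._tables)
for table in ws.tables.values():
    table_ranges.append([table.name,table.ref])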
table_ranges= pd.DataFrame(table_ranges, columns= ['Table_Name','Sheet_Range'])
#Initialising an empty list that the excel tables will be imported into
#This will create a list of lists where each Excel table will be a sub list
xl_tables=[]
#Extracting each excel table found in the file
for index, rw in table_ranges.iterrows():
    sht_range=ws[rw['Sheet_Range']]
    data_rows = []
    for j, row in enumerate(sht_range):
        data_cols = [cell.value for cell in row]
        #Adding a trailing 'Table_Name' column: the header label on the first
        #row, the name of the current table on every other row
        data_cols.append('Table_Name' if j == 0 else rw['Table_Name'])
        data_rows.append(data_cols)
    xl_tables.append(data_rows)

#Combining every table extracted from excel into a single dataframe
#(the first row of each table holds its column headers)
#Note: pd.concat raises ValueError if xl_tables is empty
df = pd.concat(
    [pd.DataFrame(tb[1:], columns=tb[0]) for tb in xl_tables],
    ignore_index=True, sort=False)
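
The same idea can be extended to what the question originally asked for: a dict of DataFrames keyed by table name, across every worksheet. The following is only a sketch under a few assumptions: the function name `tables_to_dataframes` is made up for illustration, it relies on a recent openpyxl release in which `ws.tables` is dict-like, and it assumes every table has a header row (Excel's default).

```python
import pandas as pd
from openpyxl import load_workbook

def tables_to_dataframes(path):
    """Return {table_name: DataFrame} for every Excel table in the workbook."""
    # data_only=True reads cached cell values instead of formula strings
    wb = load_workbook(path, data_only=True)
    frames = {}
    for ws in wb.worksheets:
        # Assumption: ws.tables is the dict-like accessor of recent openpyxl
        for table in ws.tables.values():
            # table.ref is a range such as 'A1:D10'; ws[ref] yields rows of cells
            rows = [[cell.value for cell in row] for row in ws[table.ref]]
            # Assumption: the first row of the range is the header row
            frames[table.name] = pd.DataFrame(rows[1:], columns=rows[0])
    return frames

# Usage (hypothetical path):
# frames = tables_to_dataframes('excel_file_path.xlsx')
# for name, frame in frames.items():
#     print(name, frame.shape)
```

The workbook is opened in normal mode on purpose, since openpyxl's read-only mode does not load table definitions. With `data_only=True`, formula cells in a file that Excel has never calculated and saved come back as None.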