Question

I can open a password-protected Excel file with the following:

import sys
import win32com.client

# Start (or attach to) Excel through COM
xlApp = win32com.client.Dispatch("Excel.Application")
print("Excel library version:", xlApp.Version)

# Usage: script.py <filename> <password>
filename, password = sys.argv[1:3]
xlwb = xlApp.Workbooks.Open(filename, Password=password)
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets(1)  # counts from 1, not from 0
print(xlws.Name)
print(xlws.Cells(1, 1).Value)  # that's A1

I'm not sure how to transfer the information to a pandas DataFrame. Do I need to read the cells one by one, or is there a convenient way to do this?

Answer 1

Assuming the starting cell is (StartRow, StartCol) and the ending cell is (EndRow, EndCol), I found that the following worked for me:

import pandas

# Get the content in the rectangular selection region
# (xlws is the worksheet object from the question);
# content is a tuple of tuples, one inner tuple per row
content = xlws.Range(xlws.Cells(StartRow, StartCol), xlws.Cells(EndRow, EndCol)).Value

# Transfer content to pandas dataframe
dataframe = pandas.DataFrame(list(content))

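Note: Excel cell B5 is addressed in win32com as row 5, column 2. Also, we need list(...) to convert the tuple of tuples into a list of tuples, because there is no pandas.DataFrame constructor for a tuple of tuples.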
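To make the row/column convention concrete, here is a minimal sketch that turns an A1-style address into the (row, column) pair that Cells(...) expects. The helper name a1_to_row_col is hypothetical and not part of win32com or pandas; it only handles plain addresses such as "B5", not ranges or absolute references like "$B$5".

```python
import re


def a1_to_row_col(address):
    """Convert an A1-style address such as 'B5' to a 1-based (row, column) pair.

    Hypothetical helper for illustration; not part of win32com or pandas.
    """
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a simple A1-style address: {address!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters behave like a base-26 number with digits A=1 ... Z=26
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits), col


print(a1_to_row_col("B5"))    # (5, 2)
print(a1_to_row_col("AA10"))  # (10, 27)
```

With this, xlws.Cells(*a1_to_row_col("B5")) would refer to the same cell as B5, assuming xlws is the worksheet from the question.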

Answer 2

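Assuming that you can save the encrypted file back to disk using the win32com API (which I realize may not work), you could then immediately call the top-level pandas function read_excel. You'll need to install some combination of xlrd (for Excel 2003), xlwt (also for 2003), and openpyxl (for Excel 2007) first. See the pandas documentation on reading Excel files. Currently, pandas does not support reading Excel files through the win32com API. You're welcome to open up a GitHub issue if you'd like.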

Answer 3

From David Hamann's site (all credit goes to him): https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/

With xlwings, opening the file will first launch the Excel application so that you can enter the password.

import pandas as pd
import xlwings as xw

PATH = '/Users/me/Desktop/xlwings_sample.xlsx'
wb = xw.Book(PATH)
sheet = wb.sheets['sample']

df = sheet['A1:C4'].options(pd.DataFrame, index=False, header=True).value
df
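As a rough picture of the result, the sketch below builds the same kind of frame with plain pandas: the first row of the range becomes the column names and no column is used as the index. The cell values are made up for illustration, and this is only an approximation of the shape of the output, not how xlwings implements its converter.

```python
import pandas as pd

# Made-up values standing in for cells A1:C4; the first row is the header
rows = [
    ["name", "qty", "price"],
    ["apple", 3.0, 0.5],
    ["pear", 1.0, 0.75],
    ["plum", 7.0, 0.2],
]

# First row -> column names, remaining rows -> data, default integer index
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df.shape)          # (3, 3)
print(list(df.columns))  # ['name', 'qty', 'price']
```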

Answer 4

Building on the suggestion provided by @ikeoddy, this should put the pieces together:

How to open a password protected excel file using python?

# Import modules
import pandas as pd
import win32com.client
import os
import getpass

# Name file variables
file_path = r'your_file_path'
file_name = r'your_file_name.extension'

full_name = os.path.join(file_path, file_name)
# print(full_name)

Getting command-line password input in Python

# You are prompted to provide the password to open the file
xl_app = win32com.client.Dispatch('Excel.Application')
pwd = getpass.getpass('Enter file password: ')

Workbooks.Open Method (Excel)

xl_wb = xl_app.Workbooks.Open(full_name, False, True, None, pwd)
xl_app.Visible = False
xl_sh = xl_wb.Worksheets('your_sheet_name')

# Get last_row
row_num = 0
cell_val = ''
while cell_val is not None:
    row_num += 1
    cell_val = xl_sh.Cells(row_num, 1).Value
    # print(row_num, '|', cell_val, type(cell_val))
last_row = row_num - 1
# print(last_row)

# Get last_column
col_num = 0
cell_val = ''
while cell_val is not None:
    col_num += 1
    cell_val = xl_sh.Cells(1, col_num).Value
    # print(col_num, '|', cell_val, type(cell_val))
last_col = col_num - 1
# print(last_col)

ikeoddy's answer:

content = xl_sh.Range(xl_sh.Cells(1, 1), xl_sh.Cells(last_row, last_col)).Value
# list(content)
df = pd.DataFrame(list(content[1:]), columns=content[0])
df.head()

python win32 COM closing excel workbook

xl_wb.Close(False)
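The two loops above read one cell per COM call. A possible alternative, sketched here under assumptions, is to read a larger block in a single call (for example xl_sh.UsedRange.Value, which returns a tuple of row tuples when the range has more than one cell) and trim it in Python using the same stopping rule. The helper name split_table is hypothetical, and the sample data is made up.

```python
def split_table(content):
    """Split a tuple of row tuples into (header, data_rows).

    Hypothetical helper that mirrors the loops above: the table ends at
    the first None in the first row (columns) and at the first None in
    the first column (rows).
    """
    if not content:
        return [], []

    # Count columns until the first empty cell in the header row
    n_cols = 0
    for value in content[0]:
        if value is None:
            break
        n_cols += 1

    # Count rows until the first empty cell in the first column
    n_rows = 0
    for row in content:
        if row[0] is None:
            break
        n_rows += 1

    trimmed = [list(row[:n_cols]) for row in content[:n_rows]]
    if not trimmed:
        return [], []
    return trimmed[0], trimmed[1:]


# Made-up block with one empty trailing column and one empty trailing row
sample = (
    ("id", "name", None),
    (1.0, "a", None),
    (2.0, "b", None),
    (None, None, None),
)
header, data = split_table(sample)
print(header)  # ['id', 'name']
print(data)    # [[1.0, 'a'], [2.0, 'b']]
```

The result can then be handed to pandas as pd.DataFrame(data, columns=header). Like the original loops, this stops at the first blank cell, so a sheet with gaps in column A or row 1 would be truncated.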

Answer 5

Adding to @Maurice's answer, to get all the cells in the sheet without having to specify a range:

wb = xw.Book(PATH, password='somestring')
sheet = wb.sheets[0] #get first sheet

#sheet.used_range.address returns string of used range
df = sheet[sheet.used_range.address].options(pd.DataFrame, index=False, header=True).value

From password-protected Excel file to pandas DataFrame
