Reading an xlsx stored on a SharePoint location with openpyxl in Python?

Asked: 2015-12-10 13:48:39

Tags: python excel sharepoint openpyxl

Quick one.

I have an XLSX file on a SharePoint drive and cannot open it with openpyxl in Python. It works fine if the file is stored on my local drive.

I tried this.

from openpyxl import load_workbook
wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')

It throws this exception:

C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
    123     except (BadZipfile, RuntimeError, IOError, ValueError):
    124         e = exc_info()[1]
--> 125         raise InvalidFileException(unicode(e))
    126     wb = Workbook(guess_types=guess_types, data_only=data_only)
    127 

InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...

Am I missing something? I need to read the contents of one of the worksheets in Python.

EDIT:

Using crussell's suggestion, I receive a 401 UNAUTHORIZED:

import requests
import urllib
from openpyxl import load_workbook
from requests.auth import HTTPBasicAuth

file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'PotatoUser'
password = 'PotatoPassword'

resp=requests.get(file, auth=HTTPBasicAuth(username, password))
print(resp.content)

It seems SharePoint and requests are incompatible here, with both Digest Authentication and Basic Authentication: http://docs.python-requests.org/en/latest/user/authentication/

3 Answers:

Answer 0 (score: 1)

Instead of trying to load the workbook directly from the web address, try downloading it first with urllib.

import urllib
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
urllib.urlretrieve(file,"test.xlsx")

After further research, urllib is apparently eschewed in favor of requests. Try this:

import requests
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'myUsername'
password = 'myPassword'

resp=requests.get(file, auth=HTTPBasicAuth(username, password))
with open('test.xlsx', 'wb') as output:
    output.write(resp.content)

To install requests:

pip install requests
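
Because the question's edit reports a 401, it may be worth checking what was actually downloaded before writing it to disk. Otherwise test.xlsx can end up holding an HTML error page, which openpyxl will reject. The following is a minimal sketch, not part of the original answer: `save_if_xlsx` is a hypothetical helper name, and the check only confirms that the payload is a ZIP archive (which every .xlsx is), not that it is a valid workbook.

```python
import zipfile
from io import BytesIO


def save_if_xlsx(content, path):
    """Write content to path only if it is a ZIP archive.

    Returns True when the file was written, False when the payload was
    something else (for example an HTML login or error page).
    """
    # .xlsx files are ZIP containers, so a non-ZIP body cannot be a workbook.
    if not zipfile.is_zipfile(BytesIO(content)):
        return False
    with open(path, "wb") as output:
        output.write(content)
    return True
```

With the code above, this would be called as `save_if_xlsx(resp.content, 'test.xlsx')`. Calling `resp.raise_for_status()` first would also turn a 401 into an exception instead of a silently saved error page.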

Answer 1 (score: 0)

You probably need to download the file first rather than opening it directly. Something like the following should work:

import urllib2
from openpyxl import load_workbook
import StringIO

data = urllib2.urlopen("https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx")
xlsx = data.read()
wb = load_workbook(StringIO.StringIO(xlsx))

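Python's StringIO can be used to make the downloaded data look like a file object. (The urllib2 and StringIO modules exist only in Python 2.)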
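
For Python 3, roughly the same idea can be sketched with `urllib.request` and `io.BytesIO`. This is an illustration under assumptions rather than the answerer's code: the helper names are hypothetical, the request is unauthenticated (so it will still fail against a server that answers 401), and the URL path is percent-encoded first because the example URL contains spaces.

```python
from io import BytesIO
from urllib.parse import quote, urlsplit, urlunsplit
from urllib.request import urlopen

from openpyxl import load_workbook


def encode_url(url):
    """Percent-encode spaces and other unsafe characters in the URL path."""
    parts = urlsplit(url)
    # Keep "/" and "%" so an already-encoded path is not encoded twice.
    return urlunsplit(parts._replace(path=quote(parts.path, safe="/%")))


def workbook_from_bytes(data):
    """Open an .xlsx held in memory by wrapping the bytes in a file object."""
    return load_workbook(BytesIO(data))


def workbook_from_url(url):
    """Download an .xlsx and open it without writing it to disk."""
    with urlopen(encode_url(url)) as response:  # no authentication is sent
        return workbook_from_bytes(response.read())
```

`workbook_from_bytes` is the part that replaces `StringIO.StringIO(xlsx)`; the bytes can come from any download method.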

Answer 2 (score: 0)

If the SharePoint (SP) site is internal, you can do this by removing "https:" from the name you pass to load_workbook():

from openpyxl import load_workbook
file = '//content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx'
wb = load_workbook(file)
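
The same transformation can be done programmatically when the address arrives as a full URL. A small sketch follows; `strip_scheme` is a hypothetical helper, and it drops any query string or fragment. The approach presumably depends on the operating system resolving a `//host/...` name as a network path, which is an assumption here and not something the answer spells out.

```python
from urllib.parse import urlsplit


def strip_scheme(url):
    """Turn 'https://host/path' into '//host/path', dropping query and fragment."""
    parts = urlsplit(url)
    return "//" + parts.netloc + parts.path
```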

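If your work account is directly connected to SP, no authentication is needed. Otherwise, at my workplace we use NTLM authentication, which you can do with HttpNtlmAuth from the requests_ntlm library.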

Let me know if this helped or if you are still interested in the problem, and I can give an example with requests_ntlm.
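
A rough sketch of what an NTLM-authenticated download could look like, assuming the third-party requests_ntlm package is installed (`pip install requests_ntlm`) and that the server really uses NTLM. The function names, the `DOMAIN\\user` username format shown in the note below, and the optional `session` parameter are illustrative choices, not something confirmed by the thread.

```python
import requests


def fetch_bytes(url, auth, session=None):
    """GET url with the given requests auth object and return the raw body."""
    if session is None:
        session = requests.Session()
    response = session.get(url, auth=auth)
    # Raise on 401/403 and other HTTP errors instead of returning an error page.
    response.raise_for_status()
    return response.content


def fetch_with_ntlm(url, username, password):
    """Download url using NTLM credentials (username like 'DOMAIN\\user')."""
    # Imported here so fetch_bytes stays usable without requests_ntlm installed.
    from requests_ntlm import HttpNtlmAuth

    return fetch_bytes(url, HttpNtlmAuth(username, password))
```

For example, `data = fetch_with_ntlm(file, 'POTATO\\PotatoUser', 'PotatoPassword')` with placeholder credentials; the returned bytes can then be wrapped in `io.BytesIO` and passed to `load_workbook`.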