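I am trying to download a file / read the content of a page. The URL is protected by SiteMinder authentication. I am using the code below, but I get:
401: Authentication error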
import ssl
import urllib.request

url = "https://myurl.com/blah..."

# TLSv1 context with certificate and hostname verification enabled
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
context.check_hostname = True
context.load_default_certs()

# Register the credentials for HTTP Basic authentication
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, url, 'myuserID', 'myPassword')
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

# Pass the context through HTTPSHandler: urlopen(url, context=...) builds its
# own opener and would ignore one installed with install_opener()
opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=context), handler)
response = opener.open(url)
print(response.read().decode('utf-8', errors='replace'))
I also tried with requests, as shown below, but I get the same error.
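import ssl
import requests
from requests_ntlm import HttpNtlmAuth
from requests_toolbelt import SSLAdapter  # assumed source of SSLAdapter; the original import was not shown

url = "https://myurl.com/blah..."
s = requests.Session()
# Mount the adapter on the scheme prefix, not on the literal string 'url'
s.mount('https://', SSLAdapter(ssl.PROTOCOL_TLSv1))
response = s.get(url, verify=True, auth=HttpNtlmAuth('myuserID', 'myPassword'))
print(response.status_code)
print(response.text)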
I am on Windows 10 with Python 3.5.
However, if I run the following on a UNIX system, it works and I can download the file:
wget --secure-protocol=TLSv1 --no-check-certificate --user=myuserID --password=myPassword 'https://myurl.com/blah...'
What is wrong with the Python code?
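For comparison, here is a rough sketch of what a closer Python counterpart of the wget call might look like. It assumes wget ends up sending HTTP Basic credentials for `--user`/`--password` (the requests attempt above uses NTLM instead), and it mirrors `--no-check-certificate` by turning verification off. The helper names `basic_auth_header` and `fetch` are hypothetical, and this has not been checked against the real server.

```python
import base64
import ssl
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch(url, user, password):
    """Hypothetical helper that mirrors the wget flags used above."""
    # --secure-protocol=TLSv1 and --no-check-certificate
    context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = urllib.request.Request(url)
    # Assumption: credentials are sent up front rather than after a 401 challenge
    request.add_header("Authorization", basic_auth_header(user, password))
    with urllib.request.urlopen(request, context=context) as response:
        return response.read()
```

Calling `fetch(url, 'myuserID', 'myPassword')` would then return the raw response bytes, if the server accepts Basic credentials at all.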
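Update: If I use the cookie-based code below, a single HTML file is downloaded. When I open that HTML file in IE, it asks for credentials again and then downloads the actual file.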
import http.cookiejar
import urllib.parse
import urllib.request

url = "https://myurl.com/blah..."
username = 'myuserID'
password = 'myPassword'

# A single opener that follows redirects and keeps cookies between requests
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPRedirectHandler(),
    urllib.request.HTTPHandler(debuglevel=0),
    urllib.request.HTTPSHandler(debuglevel=0),
    urllib.request.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)')]
urllib.request.install_opener(opener)

# Passing data makes this a POST with the form-encoded credentials
params = urllib.parse.urlencode({'user': username, 'password': password}).encode("utf-8")
print(params)
page = urllib.request.urlopen(url, params).read()
print(page)
The HTML is as follows:
<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY onLoad="document.AUTOSUBMIT.submit();">
This page is used to hold your data while you are being authorized for your request.<BR><BR>You will be forwarded to continue the authorization process. If this does not happen automatically, please click the Continue button below.
<FORM NAME="AUTOSUBMIT" METHOD="POST" ENCTYPE="application/x-www-form-urlencoded" ACTION="https://myurl.com/blah...">
<INPUT TYPE="HIDDEN" NAME="SMPostPreserve" VALUE="gPcBiwAFuJyNjkpY2Lq/i3Iq80qHAGVxfmNyp3VohPbmmfdNGf5bEhuAZXmqUWJ+ftkJ4uZvFjrnDf+sGAm13m5CDGTljCCF">
<INPUT TYPE="SUBMIT" VALUE="Continue">
</FORM>
</BODY>
</HTML>
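The page above is an auto-submitting form: a browser runs the `onLoad` handler and POSTs the hidden `SMPostPreserve` field to the form's `ACTION` URL, which a plain urllib client does not do on its own. A minimal sketch of pulling those values out with the standard library, so they could be re-posted through the same cookie-aware opener, might look like the following. The names `AutoSubmitFormParser` and `extract_form` are hypothetical, the `VALUE` in `sample` is a shortened placeholder, and whether re-posting completes the SiteMinder login here is an assumption.

```python
from html.parser import HTMLParser
import urllib.parse


class AutoSubmitFormParser(HTMLParser):
    """Collect the first form's action URL and its named input fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <FORM ACTION=...> matches
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            # Inputs without a NAME (the Continue button) are not submitted
            self.fields[attrs["name"]] = attrs.get("value") or ""


def extract_form(html):
    parser = AutoSubmitFormParser()
    parser.feed(html)
    return parser.action, parser.fields


sample = (
    '<FORM NAME="AUTOSUBMIT" METHOD="POST" ACTION="https://myurl.com/blah...">'
    '<INPUT TYPE="HIDDEN" NAME="SMPostPreserve" VALUE="abc+123/xyz">'
    '<INPUT TYPE="SUBMIT" VALUE="Continue">'
    '</FORM>'
)
action, fields = extract_form(sample)
body = urllib.parse.urlencode(fields).encode("utf-8")
print(action, body)
```

The resulting `body` could then be sent with `opener.open(action, body)` using the opener that holds the cookie jar, so any session cookies set by the first response travel with the second request.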