Question

I am trying to download a file (or read a page's content) from a URL that is protected by SiteMinder authentication. With the code below I get:

401: Authentication Error

import ssl
import urllib.request

# Require TLSv1 with certificate and host name verification.
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
context.check_hostname = True
context.load_default_certs()

# Register the credentials for HTTP Basic authentication.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
url = "https://myurl.com/blah..."
password_mgr.add_password(None, url, 'myuserID', 'myPassword')
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(handler)
urllib.request.install_opener(opener)
# Note: when context= is passed, urlopen() builds its own opener,
# so the opener installed above is not used for this request.
page = urllib.request.urlopen(url, context=context).read()
print(page)
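
One detail that may matter here: when urlopen() is called with context=, it builds a fresh opener around that context, so an opener registered through install_opener() is bypassed for that call. The sketch below puts the TLS context and the Basic auth handler into a single opener. The helper name build_auth_opener is hypothetical, and ssl.create_default_context() is swapped in for the hand-built TLSv1 context as an assumption. Even with this change, a SiteMinder-protected site may still answer 401 if it expects a form-based login rather than HTTP Basic.

```python
import ssl
import urllib.request


def build_auth_opener(url, user, password, context=None):
    """Return one opener that carries both the TLS context and Basic auth.

    Hypothetical helper for illustration; not part of urllib.
    """
    if context is None:
        # Assumption: the default context (certificate and host name
        # verification against the system trust store) is acceptable.
        context = ssl.create_default_context()
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, password)
    return urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context),
        urllib.request.HTTPBasicAuthHandler(password_mgr),
    )


# Usage (needs network access, so it is left commented out):
# url = "https://myurl.com/blah..."
# opener = build_auth_opener(url, "myuserID", "myPassword")
# with opener.open(url) as response:
#     print(response.read().decode("utf-8", errors="replace"))
```

Calling opener.open() directly, instead of urlopen(url, context=...), keeps both handlers in play for the request.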

I also tried with requests, as shown below, but I get the same error.

import ssl

import requests
from requests_ntlm import HttpNtlmAuth
from requests_toolbelt import SSLAdapter  # assumed source of SSLAdapter

url = "https://myurl.com/blah..."
s = requests.Session()
# Mount on the URL scheme prefix; the literal string 'url' never matches.
s.mount('https://', SSLAdapter(ssl.PROTOCOL_TLSv1))
response = s.get(url, verify=True, auth=HttpNtlmAuth('myuserID', 'myPassword'))
print(response.status_code)
print(response.text)

I am using Windows 10 with Python 3.5.

However, if I run the following on a UNIX system, it works and I can download the file:

wget --secure-protocol=TLSv1 --no-check-certificate --user=myuserID --password=myPassword 'https://myurl.com/blah...'
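
Since the wget call succeeds with a plain user name and password, one thing that might be worth trying from Python is sending the Basic credentials on the very first request. HTTPBasicAuthHandler only reacts to a 401 whose WWW-Authenticate header offers the Basic scheme, so it stays silent if the server challenges differently. The sketch below builds such a request with the standard library; the function name basic_auth_request is hypothetical, and whether this server accepts preemptive Basic auth is an assumption.

```python
import base64
import urllib.request


def basic_auth_request(url, user, password):
    """Build a Request that carries Basic credentials up front.

    Hypothetical helper for illustration; not part of urllib.
    """
    raw = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Usage (needs network access, so it is left commented out):
# req = basic_auth_request("https://myurl.com/blah...", "myuserID", "myPassword")
# page = urllib.request.urlopen(req).read()
```

The wget command also skips certificate verification, while the Python code enforces it. A certificate problem would normally surface as an SSL error rather than a 401, so that difference is probably not the cause.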

What is wrong with the Python code?

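Update: If I use the code below, which uses cookies, it downloads a single HTML file. When I open that HTML file in IE, it asks for credentials again and then downloads the actual file.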

import http.cookiejar
import urllib.parse
import urllib.request

url = "https://myurl.com/blah..."
username = 'myuserID'
password = 'myPassword'

# Build one opener that follows redirects and keeps cookies between requests.
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPRedirectHandler(),
    urllib.request.HTTPHandler(debuglevel=0),
    urllib.request.HTTPSHandler(debuglevel=0),
    urllib.request.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)')]
urllib.request.install_opener(opener)

# POST the credentials as form fields.
params = urllib.parse.urlencode({'user': username, 'password': password}).encode("utf-8")
print(params)
page = urllib.request.urlopen(url, params).read()
print(page)

The HTML is as follows:

<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY onLoad="document.AUTOSUBMIT.submit();">
This page is used to hold your data while you are being authorized for your request.<BR><BR>You will be forwarded to continue the authorization process. If this does not happen automatically, please click the Continue button below.
<FORM NAME="AUTOSUBMIT" METHOD="POST" ENCTYPE="application/x-www-form-urlencoded" ACTION="https://myurl.com/blah...">
<INPUT TYPE="HIDDEN" NAME="SMPostPreserve" VALUE="gPcBiwAFuJyNjkpY2Lq/i3Iq80qHAGVxfmNyp3VohPbmmfdNGf5bEhuAZXmqUWJ+ftkJ4uZvFjrnDf+sGAm13m5CDGTljCCF">
<INPUT TYPE="SUBMIT" VALUE="Continue">
</FORM>
</BODY>
</HTML>
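
The page above looks like an intermediate step: a form named AUTOSUBMIT that a browser would post automatically through its onLoad handler. A script has to do that step itself. The sketch below, using only the standard library, extracts the form's ACTION and its named INPUT values so they can be posted back with the same cookie-aware opener. The names AutoSubmitFormParser and parse_auto_submit are hypothetical, and it is an assumption that posting this form moves the login flow forward; a credentials form may still follow.

```python
from html.parser import HTMLParser
import urllib.parse


class AutoSubmitFormParser(HTMLParser):
    """Collect the ACTION URL and named INPUT values of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # HTMLParser lower-cases tag and attribute names
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            # Inputs without a NAME (the Continue button) are skipped.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def parse_auto_submit(html_text):
    """Return (action URL, field dict, url-encoded POST body)."""
    parser = AutoSubmitFormParser()
    parser.feed(html_text)
    body = urllib.parse.urlencode(parser.fields).encode("utf-8")
    return parser.action, parser.fields, body


# Usage with the opener and page from the question (needs network access):
# action, fields, body = parse_auto_submit(page.decode("utf-8"))
# next_page = opener.open(action, body).read()
```

Reusing the same opener matters because the cookies set by earlier responses are stored in its CookieJar.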

Python - File download does not work

0 answers: