Question

I need to download the file abc.zip from a website: http://clientdownload.xyz.com/Documents/abc.zip

For this, I wrote the following Python script:

    url_to_check = 'http://clientdownload.xyz.com/Documents/abc.zip'
    username = "user"
    password = "pwd"
    p = urllib2.HTTPPasswordMgrWithDefaultRealm()
    p.add_password(None, url_to_check, username, password)
    handler = urllib2.HTTPBasicAuthHandler(p)
    opener = urllib2.build_opener(handler)
    urllib2.install_opener(opener)
    zip_file = urllib2.urlopen(url_to_check).read()       
    file_name = 'somefile.zip'
    meta = zip_file.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading: %s Bytes: %s" % (file_name, file_size)

    with open(file_name, 'wb') as dwn_file:
        dwn_file.write(zip_file.read())

When I run the script, I get the following error:

      File "updateCheck.py", line 68, in check_update
        zip_file = urllib2.urlopen(url_to_check).read()
      File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 406, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 444, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 401: Unauthorized

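I have supplied the correct username and password, but it still raises the Unauthorized error.

When I download the same link with wget using the --http-user and --ask-password options, the download works.

The same script also downloads files correctly from other servers.

I ran this script to get more information: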

import urllib2, re, time, sys

theurl='http://clientdownload.xxx.com/Documents/Forms/AllItems.aspx'

req = urllib2.Request(theurl)

try:
    handle = urllib2.urlopen(req)
except IOError, e:
    if hasattr(e, 'code'):
        if e.code != 401:
            print 'We got another error'
            print e.code
        else:
            print e.headers
            print e.headers['www-authenticate']

I got the following output:

Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/7.5
SPRequestGuid: 939bad00-40b7-49b9-bbbc-99d0267a1004
X-SharePointHealthScore: 0
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
MicrosoftSharePointTeamServices: 14.0.0.6029
Date: Wed, 12 Feb 2014 13:14:19 GMT
Connection: close
Content-Length: 16

NTLM

Answer 1

You might consider using requests, which makes working with HTTP easier. In your case, installing requests-ntlm (a plugin for requests) gives you NTLM authentication in a more transparent way:

import requests
from requests_ntlm import HttpNtlmAuth

r = requests.get("http://ntlm_protected_site.com",auth=HttpNtlmAuth('domain\\username','password'))

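r holds the response, including the status code and headers. For your case, r.headers.get('Content-Length') gives the size; requests returns it as a string, so no [0] index is needed.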
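
To actually save the zip file rather than just fetch it, something along these lines might work. This is only a sketch in Python 3 syntax, assuming requests and requests-ntlm are installed; the helper names save_response and download_with_ntlm are made up for illustration and are not part of either library. It streams the body to disk in chunks instead of reading it all into memory.

```python
def save_response(response, path, chunk_size=64 * 1024):
    """Write a streamed response body to `path` and return the bytes written."""
    written = 0
    with open(path, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                out.write(chunk)
                written += len(chunk)
    return written


def download_with_ntlm(url, user, password, path):
    """Hypothetical helper: fetch `url` with NTLM auth and save it to `path`."""
    # Imported here so save_response() can be used without these packages.
    import requests
    from requests_ntlm import HttpNtlmAuth

    with requests.get(url, auth=HttpNtlmAuth(user, password), stream=True) as r:
        r.raise_for_status()  # raises requests.HTTPError on 401 and other 4xx/5xx
        # Content-Length is a string, or None if the server did not send it.
        length = r.headers.get("Content-Length")
        expected = int(length) if length is not None else None
        written = save_response(r, path)
    return written, expected
```

A call would look like download_with_ntlm('http://clientdownload.xyz.com/Documents/abc.zip', 'domain\\user', 'pwd', 'somefile.zip'), with the domain, username and password replaced by your own. Comparing the two returned numbers is a rough sanity check on the download.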
