How to pass proxy authentication (requires digest auth) using the Python requests module

Time: 2012-11-22 05:38:00

Tags: python digest-authentication python-requests proxy-server

I was previously using the Mechanize module, and am now trying the Requests module.
Python mechanize doesn't work when HTTPS and Proxy Authentication required

I have to go through a proxy server to access the Internet.
The proxy server requires authentication. I wrote the following code.

import requests
from requests.auth import HTTPProxyAuth

proxies = {"http":"192.168.20.130:8080"}
auth = HTTPProxyAuth("username", "password")

r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)

The code above works well when the proxy server requires basic authentication.
Now I want to know what I have to do when the proxy server requires digest authentication.
HTTPProxyAuth seems not to work with digest authentication (r.status_code returns 407).
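For background, a proxy that wants digest auth replies with a 407 and puts the challenge parameters in the Proxy-Authenticate header. A minimal sketch of pulling those parameters out of such a header (the header value below is made up for illustration):

```python
import re

def parse_digest_challenge(header):
    """Parse a Digest challenge, as carried in the Proxy-Authenticate
    header of a 407 response, into a dict of its parameters."""
    if not header.lower().startswith('digest '):
        return {}
    params = {}
    # Match key="value" and bare key=value pairs after the "Digest " prefix.
    for key, value in re.findall(r'(\w+)="?([^",]*)"?', header[7:]):
        params[key] = value
    return params

challenge = 'Digest realm="proxy", nonce="abc123", qop="auth", algorithm=MD5'
print(parse_digest_challenge(challenge))
# → {'realm': 'proxy', 'nonce': 'abc123', 'qop': 'auth', 'algorithm': 'MD5'}
```

These parameters (realm, nonce, etc.) are exactly what a digest-capable auth class needs in order to compute the Proxy-Authorization response, which is what the answers below do in different ways.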

9 Answers:

Answer 0 (score: 12)

No need to implement your own workaround!

Requests now has built-in support for proxies:

proxies = { 'https' : 'https://user:password@proxyip:port' } 
r = requests.get('https://url', proxies=proxies) 

See more in the docs.

This is from @BurnsBA's answer, which saved my life.

Note: you must use the proxy server's IP address, not its name!
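One pitfall with the user:password@host form: if the username or password contains reserved characters such as @, :, or \, they must be percent-encoded first, or the URL will parse incorrectly. A quick sketch (the credentials and proxy address are placeholders):

```python
from urllib.parse import quote

user = "DOMAIN\\user"    # e.g. a Windows domain account
password = "p@ss:word"   # contains '@' and ':'

# Encode each part separately so reserved characters survive URL parsing.
proxy = "https://{}:{}@192.168.20.130:8080".format(
    quote(user, safe=""), quote(password, safe=""))
print(proxy)
# → https://DOMAIN%5Cuser:p%40ss%3Aword@192.168.20.130:8080
```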

Answer 1 (score: 8)

I wrote a class that can be used for proxy authentication (it is based on digest auth).
I borrowed almost all the code from requests.auth.HTTPDigestAuth.

import requests
import requests.auth

class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):
    def handle_407(self, r):
        """Takes the given response and tries digest-auth, if needed."""

        num_407_calls = r.request.hooks['response'].count(self.handle_407)

        s_auth = r.headers.get('Proxy-authenticate', '')

        if 'digest' in s_auth.lower() and num_407_calls < 2:

            self.chal = requests.auth.parse_dict_header(s_auth.replace('Digest ', ''))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.raw.release_conn()

            r.request.headers['Authorization'] = self.build_digest_header(r.request.method, r.request.url)
            r.request.send(anyway=True)
            _r = r.request.response
            _r.history.append(r)

            return _r

        return r

    def __call__(self, r):
        if self.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)
        r.register_hook('response', self.handle_407)
        return r

Usage:

proxies = {
    "http" :"192.168.20.130:8080",
    "https":"192.168.20.130:8080",
}
auth = HTTPProxyDigestAuth("username", "password")

# HTTP
r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK

# HTTPS
r = requests.get("https://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK

Answer 2 (score: 4)

I have written a Python module (available here) that can authenticate with an HTTP proxy using the digest scheme. It works when connecting to HTTPS websites (via monkey-patching) and can also authenticate with the website itself. It should work with the latest requests library on both Python 2 and 3.

The following example fetches the web page https://httpbin.org/ip through the HTTP proxy 1.2.3.4:8080, which requires HTTP digest authentication with username user1 and password password1:

import requests
from requests_digest_proxy import HTTPProxyDigestAuth

s = requests.Session()
s.proxies = {
        'http': 'http://1.2.3.4:8080/',
        'https': 'http://1.2.3.4:8080/'
}
s.auth = HTTPProxyDigestAuth(('user1', 'password1'))

print(s.get('https://httpbin.org/ip').text)

If the website itself also requires some kind of HTTP authentication, it can be specified to the HTTPProxyDigestAuth constructor as follows:

# HTTP Basic authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
        auth=requests.auth.HTTPBasicAuth('user1', 'password0'))
print(s.get('https://httpbin.org/basic-auth/user1/password0').text)

# HTTP Digest authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
        auth=requests.auth.HTTPDigestAuth('user1', 'password0'))
print(s.get('https://httpbin.org/digest-auth/auth/user1/password0').text)

Answer 3 (score: 1)

For those still landing here, there seems to be a project called requests-toolbelt that has this, along with other common but not built-in requests functionality.

https://toolbelt.readthedocs.org/en/latest/authentication.html#httpproxydigestauth

Answer 4 (score: 1)

import requests

# in my case I had to add my local domain
proxies = {
  'http': 'proxy.myagency.com:8080',
  'https': 'user@localdomain:password@proxy.myagency.com:8080',
}


r=requests.get('https://api.github.com/events', proxies=proxies)
print(r.text)

Answer 5 (score: 1)

This snippet works for both types of requests (http and https). Tested on the current version of requests (2.23.0).

import re
import requests
from requests.utils import get_auth_from_url
from requests.auth import HTTPDigestAuth
from requests.utils import parse_dict_header
from urllib3.util import parse_url

def get_proxy_authorization_header(proxy, method):
    username, password = get_auth_from_url(proxy)
    auth = HTTPProxyDigestAuth(username, password)
    proxy_url = parse_url(proxy)
    proxy_response = requests.request(method, proxy_url, auth=auth)
    return proxy_response.request.headers['Proxy-Authorization']


class HTTPSAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        headers = {}
        proxy_auth_header = get_proxy_authorization_header(proxy, 'CONNECT')
        headers['Proxy-Authorization'] = proxy_auth_header
        return headers


class HTTPAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        return {}

    def add_headers(self, request, **kwargs):
        proxy = kwargs['proxies'].get('http', '')
        if proxy:
            proxy_auth_header = get_proxy_authorization_header(proxy, request.method)
            request.headers['Proxy-Authorization'] = proxy_auth_header



class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):

    def init_per_thread_state(self):
        # Ensure state is initialized just once per-thread
        if not hasattr(self._thread_local, 'init'):
            self._thread_local.init = True
            self._thread_local.last_nonce = ''
            self._thread_local.nonce_count = 0
            self._thread_local.chal = {}
            self._thread_local.pos = None
            self._thread_local.num_407_calls = None

    def handle_407(self, r, **kwargs):
        """
        Takes the given response and tries digest-auth, if needed.
        :rtype: requests.Response
        """

        # If response is not 407, do not auth
        if r.status_code != 407:
            self._thread_local.num_407_calls = 1
            return r

        s_auth = r.headers.get('proxy-authenticate', '')

        if 'digest' in s_auth.lower() and self._thread_local.num_407_calls < 2:
            self._thread_local.num_407_calls += 1
            pat = re.compile(r'digest ', flags=re.IGNORECASE)
            self._thread_local.chal = requests.utils.parse_dict_header(
                    pat.sub('', s_auth, count=1))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.close()
            prep = r.request.copy()
            requests.cookies.extract_cookies_to_jar(prep._cookies, r.request, r.raw)
            prep.prepare_cookies(prep._cookies)

            prep.headers['Proxy-Authorization'] = self.build_digest_header(prep.method, prep.url)
            _r = r.connection.send(prep, **kwargs)
            _r.history.append(r)
            _r.request = prep

            return _r

        self._thread_local.num_407_calls = 1
        return r

    def __call__(self, r):
        # Initialize per-thread state, if needed
        self.init_per_thread_state()
        # If we have a saved nonce, skip the 407
        if self._thread_local.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)

        r.register_hook('response', self.handle_407)
        self._thread_local.num_407_calls = 1

        return r


session = requests.Session()
session.proxies = {
    'http': 'http://username:password@proxyhost:proxyport',
    'https':  'http://username:password@proxyhost:proxyport'
}
session.trust_env = False

session.mount('http://', HTTPAdapterWithProxyDigestAuth())
session.mount('https://', HTTPSAdapterWithProxyDigestAuth())

response_http = session.get("http://ww3.safestyle-windows.co.uk/the-secret-door/")
print(response_http.status_code)

response_https = session.get("https://stackoverflow.com/questions/13506455/how-to-pass-proxy-authentication-requires-digest-auth-by-using-python-requests")
print(response_https.status_code)

In general, the problem of proxy authorization when connecting over HTTPS also affects other authentication types (NTLM, Kerberos). Even though there are many issues about it (open since 2013, and perhaps earlier ones I could not find):

in requests: Digest Proxy Auth, NTLM Proxy Auth, Kerberos Proxy Auth

in urllib3: NTLM Proxy Auth, NTLM Proxy Auth

and many others, the problem remains unsolved.

The root of the problem lies in the _tunnel method of the httplib (Python 2) / http.client (Python 3) module. On a failed connection attempt it raises an OSError without returning the response code (407 in our case) and the other data needed to build the authorization header. Lukasa gave an explanation here. As long as the maintainers of urllib3 (or requests) offer no solution, we can only use various workarounds (for example, use @Tey's approach, or do something like this). In this version of the workaround, we prepare the necessary authorization data by sending a request to the proxy server and processing the response received.
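To illustrate what such a workaround ultimately computes, here is a minimal RFC 2617 MD5 digest calculation for a Proxy-Authorization header (simplest case only, no qop/cnonce handling; all values below are made up):

```python
import hashlib

def md5(s):
    return hashlib.md5(s.encode()).hexdigest()

def proxy_digest_header(username, password, realm, nonce, method, uri):
    """Build the value of a Proxy-Authorization header for the
    simplest RFC 2617 case (MD5 algorithm, no qop)."""
    ha1 = md5(f"{username}:{realm}:{password}")   # credentials hash
    ha2 = md5(f"{method}:{uri}")                  # request hash
    response = md5(f"{ha1}:{nonce}:{ha2}")        # challenge response
    return (f'Digest username="{username}", realm="{realm}", '
            f'nonce="{nonce}", uri="{uri}", response="{response}"')

print(proxy_digest_header("user1", "password1", "proxy", "abc123",
                          "CONNECT", "stackoverflow.com:443"))
```

The realm and nonce come from the proxy's 407 challenge; for HTTPS the method is CONNECT and the URI is the host:port being tunneled.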

Answer 6 (score: 0)

You can use digest authentication by using requests.auth.HTTPDigestAuth instead of requests.auth.HTTPProxyAuth.

Answer 7 (score: 0)

The following is an answer for cases other than HTTP basic authentication, for example a transparent proxy within an organization.

import requests

url      = 'https://someaddress-behindproxy.com'
params   = {'apikey': '123456789'}                     #if you need params
proxies  = {'https': 'https://proxyaddress.com:3128'}  #or some other port
response = requests.get(url, proxies=proxies, params=params)

I hope this helps someone.

Answer 8 (score: 0)

This worked for me. Admittedly, this solution does nothing for the security of user:password:

import requests
import os

http_proxyf = 'http://user:password@proxyip:port'
os.environ["http_proxy"] = http_proxyf
os.environ["https_proxy"] = http_proxyf

sess = requests.Session()
# maybe need sess.trust_env = True
print(sess.get('https://some.org').text)
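As a sanity check for the environment-variable approach, the standard library reads the same variables, so you can confirm what will be picked up before making any request (credentials and address are placeholders):

```python
import os
import urllib.request

os.environ["http_proxy"] = "http://user:password@proxyip:8080"
os.environ["https_proxy"] = "http://user:password@proxyip:8080"

# getproxies() reads the *_proxy environment variables, like requests
# does when trust_env is enabled (the default).
print(urllib.request.getproxies())
```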