urllib.request.urlretrieve与代理?

时间:2014-04-09 15:22:41

标签: python-3.x urllib

不知怎的,我不能通过代理服务器下载文件,我不知道我做错了什么。我只是暂停了。有什么建议吗?

import urllib.request

urllib.request.ProxyHandler({"http" : "myproxy:123"})
urllib.request.urlretrieve("http://myfile", "file.file")

2 个答案:

答案 0 :(得分:21)

您需要使用您的代理对象,而不仅仅是实例化它(您创建了一个对象,但没有将它分配给变量,因此无法使用它)。尝试使用这种模式:

#create the object, assign it to a variable
proxy = urllib.request.ProxyHandler({'http': '127.0.0.1'})
# construct a new opener using your proxy settings
opener = urllib.request.build_opener(proxy)
# install the openen on the module-level
urllib.request.install_opener(opener)
# make a request
urllib.request.urlretrieve('http://www.google.com')

或者,如果您不需要依赖std-lib,请使用请求(此代码来自官方文档):

import requests

proxies = {"http": "http://10.10.1.10:3128",
           "https": "http://10.10.1.10:1080"}

requests.get("http://example.org", proxies=proxies)

答案 1 :(得分:0)

urllib 从系统环境中读取代理设置。

根据urllib\request.py中的代码片段,将http_proxy和https_proxy设置为环境变量即可。

与此同时,它也记录在此处:https://www.cmi.ac.in/~madhavan/courses/prog2-2015/docs/python-3.4.2-docs-html/howto/urllib2.html#proxies

    # Proxy handling
    def getproxies_environment():
    """Return a dictionary of scheme -> proxy server URL mappings.

    Scan the environment for variables named <scheme>_proxy;
    this seems to be the standard convention.  If you need a
    different way, you can pass a proxies dictionary to the
    [Fancy]URLopener constructor.

    """
    proxies = {}
    # in order to prefer lowercase variables, process environment in
    # two passes: first matches any, second pass matches lowercase only
    for name, value in os.environ.items():
        name = name.lower()
        if value and name[-6:] == '_proxy':
            proxies[name[:-6]] = value
    # CVE-2016-1000110 - If we are running as CGI script, forget HTTP_PROXY
    # (non-all-lowercase) as it may be set from the web server by a "Proxy:"
    # header from the client
    # If "proxy" is lowercase, it will still be used thanks to the next block
    if 'REQUEST_METHOD' in os.environ:
        proxies.pop('http', None)
    for name, value in os.environ.items():
        if name[-6:] == '_proxy':
            name = name.lower()
            if value:
                proxies[name[:-6]] = value
            else:
                proxies.pop(name[:-6], None)
    return proxies