Unable to fetch a URL with urllib2 or requests

Time: 2014-04-23 05:12:39

Tags: python urllib2 python-requests

I'm trying to run this on a remote Ubuntu server:

>>> import urllib2, requests
>>> url = 'http://python.org/'
>>> urllib2.urlopen(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

>>> requests.get(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 382, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 505, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 99, in resolve_redirects
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

But the same code works fine on my local Windows machine:

>>> urllib2.urlopen(url)
<addinfourl at 57470168 whose fp = <socket._fileobject object at 0x036CB630>>
>>> requests.get(url)
<Response [200]>

I have no idea what is going on and would appreciate any suggestions.

Update

I tried S.M. Al Mamun's suggestion and got an exception with a long traceback:

>>> req = urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
>>> urllib2.urlopen(req).read()
...
long traceback (more than one page)
...
urllib2.HTTPError: HTTP Error 303: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
See Other

So it is an infinite redirect loop again (the same problem as the TooManyRedirects exception above).

1 Answer:

Answer 0: (score: 0)

Try sending a User-Agent header:

import urllib2
# Send a browser-like User-Agent instead of urllib2's default one
req = urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
urllib2.urlopen(req).read()

If it still doesn't work, your Ubuntu machine may be offline!
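
The 404 and 303 replies in the question do come from some HTTP server, possibly a proxy or gateway sitting between the Ubuntu box and python.org. So one possible next step is to follow the redirects by hand and print every hop. Below is a rough sketch of that idea, not a definitive tool. The names fetch_once and trace_redirects are made up for illustration. It is written against the Python 3 standard library; on Python 2.7, as used in the question, the same classes live in httplib and urlparse.

```python
# Sketch: follow redirects manually so that every hop is visible.
# fetch_once and trace_redirects are hypothetical helper names.
import http.client
from urllib.parse import urljoin, urlsplit


def fetch_once(url):
    """Send one GET without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "Mozilla/5.0"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Return a list of (url, status) hops.

    Stops at the first non-redirect response, when a URL repeats
    (recorded as status "loop"), or after max_hops requests.
    """
    hops, seen = [], set()
    while len(hops) < max_hops:
        status, location = fetch(url)
        hops.append((url, status))
        if status not in (301, 302, 303, 307, 308) or not location:
            break
        seen.add(url)
        url = urljoin(url, location)  # Location may be a relative URL
        if url in seen:
            hops.append((url, "loop"))
            break
    return hops
```

Running trace_redirects('http://python.org/') on both the Ubuntu server and the Windows machine and comparing the two lists should show where the server-side request starts going in circles, and whether the hops even point at python.org.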