urllib.error.HTTPError:HTTP错误400:错误请求

时间:2014-11-12 18:49:48

标签: python python-3.x urllib http-status-code-400 urlopen

HY! 我试图打开网页,通常是在浏览器中打开,但是python只是发誓并且不想工作。

import urllib.request, urllib.error
f = urllib.request.urlopen('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')

另一种方式

import urllib.request, urllib.error
opener=urllib.request.build_opener()
f=opener.open('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphi
re')

这两个选项都会出现一种错误:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\urllib\request.py", line 461, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 571, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 493, in error
    result = self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 676, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python34\lib\urllib\request.py", line 461, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 571, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 499, in error
    return self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 579, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

有什么想法吗?

2 个答案:

答案 0 :(得分:2)

他们可能会阻止它不是来自浏览器。您可能需要有效的User-Agent标题。

使用请求,这有效:

import requests
headers = 
{
 'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)     Chrome/37.0.2049.0 Safari/537.36'
}

r = requests.get('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire', headers=headers)
print r
print r.headers

答案 1 :(得分:2)

此URL似乎正在执行用户代理字符串检查。如果我将Firefox中的用户代理字符串调整为Python-urllib/2.7,则会因您看到的Bad Request而失败。

在使用urllib时,您可以按照tutorial

调整用户代理
from urllib.request import FancyURLopener

class MyOpener(FancyURLopener):
    version = 'My new User-Agent'   # Set this to a string you want for your user agent

myopener = MyOpener()
page = myopener.open('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')