urllib2.HTTPError: HTTP Error 404: Not Found for a valid URL

Asked: 2014-10-26 22:54:47

Tags: python python-2.7 facebook-opengraph urllib2 opengraph

I am using the Python opengraph library to parse a website's Open Graph tags: https://github.com/erikriver/opengraph

import opengraph
url = 'http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/'
og = opengraph.OpenGraph(url=url)
print og.to_json()

When I run this script, I get the following error:

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    raw = urllib2.urlopen(url)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

Before parsing, the library uses urllib2 to fetch the HTML: https://github.com/erikriver/opengraph/blob/master/opengraph/opengraph.py#L50-L52

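Why am I getting a 404 error? I can open this URL in a browser, and the PHP library https://github.com/scottmac/opengraph retrieves the Open Graph tags for it without a problem.

The Python library retrieves Open Graph tags for every other URL I have tried, but this URL seems to be an exception.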

1 Answer:

Answer 0: (score: 1)

Update

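You are getting a 404 response because your request does not send a browser-like User-Agent. By default urllib2 identifies itself as Python-urllib/2.7, which this site appears to reject. I just installed opengraph in a virtualenv to test it, and it works once the missing User-Agent header is added: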

url = 'http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/'
req = opengraph.opengraph.urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
og = opengraph.OpenGraph()
og.parser(opengraph.opengraph.urllib2.urlopen(req).read())
og.to_json()

'{"site_name": "Fox News", "description": "Registered gun owners in the United Kingdom are now subject to unannounced visits to their homes under new guidance that allows police to inspect firearms storage without a warrant.", "title": "UK gun owners now subject to warrantless home searches", "url": "http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/", "image": "http://global.fncstatic.com/static/v/all/img/fn_128x128.png", "scrape": false, "_url": null, "type": "article"}'
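
The same idea can be wrapped in small helpers. This is only a sketch, not part of the opengraph library: the names `BROWSER_UA`, `default_user_agent`, `build_request` and `fetch_html` are made up for illustration, and the import fallback is an assumption so that the snippet runs on both Python 2 (urllib2) and Python 3 (urllib.request). It shows which User-Agent urllib sends when none is set and how to override it per request.

```python
try:
    # Python 3 location of the same API
    from urllib.request import Request, build_opener, urlopen
except ImportError:
    # Python 2, as used in the question
    from urllib2 import Request, build_opener, urlopen

# Same value as in the answer above; any browser-like string may work.
BROWSER_UA = 'Mozilla/5.0'


def default_user_agent():
    """Return the User-Agent urllib sends when the caller sets none."""
    return dict(build_opener().addheaders).get('User-agent')


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that carries an explicit User-Agent header."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch_html(url, user_agent=BROWSER_UA):
    """Fetch the page body; HTTPError is still raised on 4xx/5xx."""
    response = urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()
```

With these helpers, the answer's snippet would reduce to og.parser(fetch_html(url)). Whether a given site accepts the header is up to the server, so catching urllib2.HTTPError around the call is still worthwhile.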