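I am using the Python requests library to fetch the headers of HTML pages and read the encoding from them. For some links, however, requests fails to get the headers. In those cases I would like to fall back to the encoding "utf-8". How do I handle such cases? How do I handle an error raised by requests.head?
Here is my code: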
import requests

r = requests.head(link)  # how to handle the error if this fails?
charset = r.encoding
if not charset:
    charset = "utf-8"
When requests fails to get the headers, I get this error:
  File "parsexml.py", line 78, in parsefile
    r = requests.head(link)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 74, in head
    return request('head', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 40, in request
    return s.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 229, in request
    r.send(prefetch=prefetch)
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 605, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.standardzilla.com', port=80): Max retries exceeded with url: /2008/08/01/diaries-of-a-freelancer-day-thirty-seven/
Answer 0 (score: 2)
You should put the code in a try-except block and catch ConnectionError. Like this:
try:
    r = requests.head(link)  # raises ConnectionError if the host cannot be reached
    charset = r.encoding
    if not charset:
        charset = "utf-8"
except requests.exceptions.ConnectionError:
    print('Unable to access ' + link)
    charset = "utf-8"  # fall back to the default so charset is always defined