Catching error 60 (timeout) with urllib2

Time: 2012-11-08 07:38:02

Tags: python

I am trying to catch error 60 (a timeout) and continue running my script. This is what I am doing right now:

import urllib2
import csv
from bs4 import BeautifulSoup


matcher = csv.reader(open('matcher.csv', "rb" ))

for i in matcher:
    url = i[1]
    if len(list(url)) > 0:
        print url
        try:
            soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

        except urllib2.URLError, e:
            print ("There was an error: %r" % e)

It returns:

  

Traceback (most recent call last):
  File "debug.py", line 13, in <module>
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

How can I catch this error and "continue"?

2 Answers:

Answer 0: (score: 4)

You can import the exception class and add another except block:

import socket

try:
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

except urllib2.URLError as e:
    print ("There was an error: %r" % e)
except socket.timeout as e: # <-------- this block here
    print "We timed out"

Update: Well, I learned something new. I just found a reference to the .reason attribute:

except urllib2.URLError as e:
    if isinstance(e.reason, socket.timeout):
        pass # ignore this one
    else:
        # do stuff re other errors if you can...
        raise # otherwise propagate the error
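
Putting the two variants together, a minimal sketch of the whole loop might look like the following. The helper names is_timeout and fetch_all and the opener parameter are hypothetical, introduced only for illustration, and the import fallback is an assumption so that the same sketch also runs on Python 3, where urllib2 was split into urllib.request and urllib.error. The traceback in the question shows the timeout surfacing from getresponse, that is, while reading the response, which appears to be why it arrives as a bare socket.timeout rather than wrapped in URLError, so both cases are handled.

```python
import socket

try:  # Python 2, as in the question
    import urllib2 as request_mod
    from urllib2 import URLError
except ImportError:  # assumed Python 3 equivalent
    import urllib.request as request_mod
    from urllib.error import URLError


def is_timeout(exc):
    """Return True if exc is a timeout, raised directly or wrapped in URLError."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, URLError) and isinstance(exc.reason, socket.timeout)


def fetch_all(urls, opener=request_mod.urlopen, timeout=10):
    """Fetch each URL, skipping the ones that time out or fail to open."""
    pages = {}
    for url in urls:
        if not url:
            continue  # skip empty cells
        try:
            pages[url] = opener(url, timeout=timeout).read()
        except (URLError, socket.timeout) as e:
            kind = "timed out" if is_timeout(e) else "failed"
            print("%s %s: %r" % (url, kind, e))
            continue  # move on to the next URL
    return pages
```

With the question's CSV reader, fetch_all(row[1] for row in matcher) would then skip slow URLs instead of stopping the script.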

Answer 1: (score: 1)

You can try except Exception as e: to catch all errors. Keep in mind, though, that this catches every error, so avoid it if you only want to handle a specific one.

Edit: You can check the exception type by doing the following:

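import os
import sys

except Exception as e:
    # sys.exc_info() returns (type, value, traceback) for the exception being handled
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
    print(exc_type, fname, exc_tb.tb_lineno)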
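
Note that exc_tb.tb_lineno above refers to the frame in which the exception was caught. If the place where it was actually raised is more useful, the standard traceback module can report the innermost frame instead. The following is only a sketch; describe_exception and risky are hypothetical names used for illustration, with risky standing in for a failing urlopen call.

```python
import sys
import traceback


def describe_exception():
    """Summarise the exception currently being handled.

    Returns (type name, file name, line number) for the innermost
    traceback frame, i.e. the place where the exception was raised.
    """
    exc_type, exc_obj, exc_tb = sys.exc_info()
    filename, lineno, func_name, text = traceback.extract_tb(exc_tb)[-1]
    return exc_type.__name__, filename, lineno


def risky():
    raise ValueError("demo failure")  # stand-in for a failing urlopen call


try:
    risky()
except Exception:
    name, filename, lineno = describe_exception()
    print("%s raised at %s:%d" % (name, filename, lineno))
```

Even when logging like this, catching a narrow exception type first and falling back to Exception only as a last resort keeps unrelated bugs from being silently swallowed.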