Question

I am scraping content from 100k systematic URLs (example.com/entry/1 > example.com/entry/100000).

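However, about 10% of the URLs have been deleted, which means that when the script reaches one of them it raises the error "urllib2.HTTPError: HTTP Error 404" and stops running.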

I'm relatively new to Python and was wondering whether there is a way to do something like this:

if result == error:
    div_text = "missing"

That way the loop could continue to the next URL, but make a note that this one failed.

Answer 1

urllib2.HTTPError is an exception that Python raises. You can wrap your URL call in a try/except block:

try:
    # ... put your URL open call here ... 
except urllib2.HTTPError:
    div_text = 'missing'

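This way, whenever this exception is raised, the Python interpreter runs the code in the except block instead of stopping, and your loop can move on to the next URL.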
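
To show how this might fit into the loop from the question, here is a minimal sketch. The function name scrape_entries, the fetch parameter, and the http://example.com/entry/N URL pattern are assumptions made for illustration, not something taken from your script. The import fallback is there because urllib2 only exists in Python 2; in Python 3 the same names live in urllib.request and urllib.error.

```python
try:
    # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError
except ImportError:
    # Python 3: urllib2 was split into urllib.request and urllib.error
    from urllib.request import urlopen
    from urllib.error import HTTPError


def scrape_entries(entry_ids, fetch=urlopen):
    """Fetch each entry page; record 'missing' for entries that return 404.

    `fetch` is a hypothetical hook (defaults to urlopen) so the loop can be
    tried out without touching the network.
    """
    results = {}
    for entry_id in entry_ids:
        # Assumed URL pattern, based on the example in the question
        url = "http://example.com/entry/%d" % entry_id
        try:
            results[entry_id] = fetch(url).read()
        except HTTPError as err:
            if err.code == 404:
                # Deleted entry: note the failure and carry on with the loop
                results[entry_id] = "missing"
            else:
                # Other HTTP errors are probably worth seeing, so re-raise
                raise
    return results
```

You would call it as something like scrape_entries(range(1, 100001)). Re-raising anything other than a 404 is a design choice on my part; if you would rather treat every HTTP error as missing, drop the err.code check.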