Question

I have this script that scrapes a website and finds the items I need:

from socket import timeout
from urllib.request import Request, urlopen
from urllib.error import URLError
import bs4
import urllib.parse
def track(self):
    for _object in _objects:
        req = Request('http://example.com/item.php?id='+str(_object))
        req.add_header('User-Agent',
                       'Mozilla 5.0')
        _URL = urlopen(req).read()
        soup = bs4.BeautifulSoup(_URL, "html.parser")
        allResults = []
        i = 1

        for hit in soup.find_all('cite'):
            if "% Off" in hit.text:
                allResults.append(str(i) + ". " + hit.text + " | Item => " + str(_object))
                i += 1

        if not allResults:
            print("No result found for this item => " + str(_object))
        else:
            for element in allResults:
                print(element)

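I want to handle exceptions so that when the connection to the website fails, or the URL cannot be reached for any other reason, the script prints "Something wrong happened".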

I know I have to use socket.timeout, but where in the code should I put it?

Answer 1

Wrap the urlopen call in a try/except block:

try:
    _URL = urlopen(req).read()
except Exception as e:
    print("Something happened wrong: {}".format(e))
    # do something, e.g. continue to the next item
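
Catching Exception works, but it also hides unrelated bugs. A narrower variant is sketched below, assuming you want a timeout on the request and separate messages per failure type. The helper name fetch, its 10-second default, and the opener parameter (only there so the function can be exercised without a network) are illustrative assumptions, not part of the original script. HTTPError is caught before URLError because it is a subclass of it.

```python
import socket
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch(url, timeout=10, opener=urlopen):
    """Return the response body as bytes, or None if the request failed.

    `opener` is a hypothetical hook for testing; it defaults to urlopen.
    """
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with opener(req, timeout=timeout) as response:
            return response.read()
    except HTTPError as e:
        # The server answered, but with an error status (404, 500, ...).
        print("Something wrong happened: HTTP {} for {}".format(e.code, url))
    except URLError as e:
        # DNS failure, refused connection, or a timeout while connecting.
        print("Something wrong happened: {} for {}".format(e.reason, url))
    except socket.timeout:
        # Connected, but waiting for or reading the response took too long.
        print("Something wrong happened: timed out for {}".format(url))
    return None
```

Inside the loop from the question, you could call page = fetch('http://example.com/item.php?id=' + str(_object)) and use continue when page is None, so one failing item does not stop the rest.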

Where and what exception should I use? urllib, Python 3
