I have a web scraper that downloads pages a few times an hour. Roughly one attempt in 15 or 20 fails with:
[Errno 10054] An existing connection was forcibly closed by the remote host
or
[Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Is there a better way to handle this? Here is what I have now:
import time
import urllib2

def get_page(url):
    def get_page_once(url):
        # One download attempt; swallow any error and return ''.
        try:
            page = opener.open(url).read()
        except Exception as e:
            print('Failed to download %s: %s' % (url, e))
            page = ''
        return page

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0')]
    page = get_page_once(url)
    if page == '':
        # One retry after a short pause.
        time.sleep(2)
        page = get_page_once(url)
    return page
I could add more retries, but I worry about spending too much time in this function.
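For reference, this is a minimal sketch of what I mean by multiple retries, using exponential backoff so the total wait stays bounded. The helper name get_page_with_retries is just illustrative, and max_tries and the initial delay are placeholder numbers, not values I have tuned:

import time
import urllib2

def get_page_with_retries(url, max_tries=4, delay=2):
    # Sketch only: try up to max_tries times, doubling the pause
    # between attempts so the worst case stays bounded.
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    for attempt in range(max_tries):
        try:
            return opener.open(url).read()
        except Exception as e:
            print('Attempt %d failed for %s: %s' % (attempt + 1, url, e))
            if attempt + 1 < max_tries:
                time.sleep(delay)
                delay *= 2  # back off: 2s, 4s, 8s, ...
    return ''

With these defaults the pauses are 2, 4 and 8 seconds, so a page that never responds costs at most about 14 seconds of waiting on top of the connection timeouts themselves.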