Using session.mount, I can ask requests to retry as many times as I want, but there seems to be no way to control the interval between attempts. Right now I have to use code like this:
import logging
import time

import requests

data = None
for retry in range(1, 5):
    logging.warning('[fetch] try=%d, url=%s' % (retry, url))
    try:
        resp = requests.get(url, timeout=3)
        data = resp.text
    except Exception as e:
        logging.warning('[try=%d] fetch_list_html: %s' % (retry, e))
    if data is None:
        time.sleep(retry * 2 + 1)  # back off before the next attempt
    else:
        break
Is there a better solution?
Answer 0 (score: 0)
According to the source code of urllib3.util.retry, you can adjust backoff_factor to control the delay between retries:
:param float backoff_factor:
    A backoff factor to apply between attempts after the second try
    (most errors are resolved immediately by a second try without a
    delay). urllib3 will sleep for::

        {backoff factor} * (2 ^ ({number of total retries} - 1))

    seconds. If the backoff_factor is 0.1, then :func:`.sleep` will sleep
    for [0.0s, 0.2s, 0.4s, ...] between retries. It will never be longer
    than :attr:`Retry.BACKOFF_MAX`.

    By default, backoff is disabled (set to 0).
In the link you posted it is set to 0.3, which may be too small for you, so you can set it to 1 instead. urllib3 will then sleep for [0s, 2s, 4s, ...] between retries, but never longer than Retry.BACKOFF_MAX (120 seconds).
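A minimal sketch of how this might be wired up (the URL is a placeholder, and the status_forcelist values are only an assumption about which responses you want retried):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

# total=5 attempts; backoff_factor=1 gives sleeps of [0s, 2s, 4s, ...]
# between attempts, as the docstring above describes
retries = Retry(total=5, backoff_factor=1,
                status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)

resp = session.get('http://example.com/list', timeout=3)  # placeholder URL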
Answer 1 (score: 0)
But Timeout is also a distinct exception type, so you can get this behavior by checking which exception was raised:
http://docs.python-requests.org/en/master/user/quickstart/#timeouts
Basically:
from requests.exceptions import Timeout

for retry in range(1, 5):
    logging.warning('[fetch] try=%d, url=%s' % (retry, url))
    retry_because_of_timeout = False
    try:
        resp = requests.get(url, timeout=3)
        data = resp.text
    except Timeout as e:
        # only a timeout should lead to another attempt
        logging.warning('[try=%d] fetch_list_html: %s' % (retry, e))
        retry_because_of_timeout = True
    except Exception as e:
        # any other error: log it and give up
        logging.warning('[try=%d] fetch_list_html: %s' % (retry, e))
    if retry_because_of_timeout:
        time.sleep(retry * 2 + 1)
    else:
        break
I would actually refactor this completely, though: move the request into a sub-function and use the exception itself to break out of the retry logic instead of the if/else flag.
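A minimal sketch of what that refactor might look like (fetch_with_retry is a hypothetical name; the attempt count and growing delay mirror the question's loop):

import logging
import time

import requests
from requests.exceptions import Timeout

def fetch_with_retry(url, attempts=4, timeout=3):
    # hypothetical helper: returns the response body, retrying only on timeouts
    for retry in range(1, attempts + 1):
        try:
            return requests.get(url, timeout=timeout).text
        except Timeout as e:
            logging.warning('[try=%d] fetch_with_retry: %s' % (retry, e))
            time.sleep(retry * 2 + 1)  # same growing delay as the question
    raise Timeout('gave up on %s after %d attempts' % (url, attempts))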