How do I handle rate limiting with requests_futures?

Time: 2019-10-06 09:02:03

Tags: python rest python-requests

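I have been using Python requests to fetch data from an API, but I would like to speed this up by running the requests asynchronously with requests_futures. The API allows only 200 requests per minute, so I have to check for the limit, wait the specified number of seconds, and then retry. That number is returned in the Retry-After header. Here is the original working code: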

  import time
  import requests

  session = requests.Session()
  for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    # go through the session so the connection is reused
    req = session.get(url, auth=zd_secret)

    # 429 means rate limited: wait as long as the server asks, then retry once
    if req.status_code == 429:
      time.sleep(int(req.headers['Retry-After']))
      req = session.get(url, auth=zd_secret)

    comments += req.json()['comments']

The following asynchronous code works until it hits the rate limit, and then all requests fail.

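  from requests_futures.sessions import FuturesSession

  session = FuturesSession()
  futures = {}
  # queue every request up front; each call returns a Future immediately
  for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

  # collect the results; nothing here checks for a 429 response
  for id in ticketIds:
    comments += futures[id].result().json()['comments']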

When I hit the rate limit, I need a way to retry only the failed requests. Does requests_futures have a built-in way to handle this?

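Update: the requests_futures library has nothing built in for this. I found this related open issue: https://github.com/ross/requests-futures/issues/26. Since I know the API's limit, I will try to pace my requests to stay under it, but that will not help if another user in my organization is using the same API at the same time.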
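
The pacing idea from the update could look roughly like the sketch below. It is only an illustration under stated assumptions: `Pacer` is a hypothetical helper, not part of requests or requests_futures, and it spaces calls evenly (60 s / 200 = 0.3 s apart) rather than tracking the server's real counter, so it cannot account for other users sharing the same limit.

```python
import time


class Pacer:
    """Space out calls so that at most `limit` of them start per `period` seconds."""

    def __init__(self, limit, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = period / limit  # minimum gap between two requests
        self.clock = clock              # injectable for testing
        self.sleep = sleep              # injectable for testing
        self.next_slot = None           # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval


# Hypothetical usage with the loop from the question:
#   pacer = Pacer(limit=200)
#   for id in ticketIds:
#       pacer.wait()
#       futures[id] = session.get(url, auth=zd_secret)
```

Even with pacing in place, a 429 can still arrive, so the retry handling is still needed as a fallback.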

2 Answers:

Answer 0 (score: 0)

You should be able to use the Retry class from urllib3.util.retry to achieve this:

from requests_futures.sessions import FuturesSession
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = FuturesSession()
retries = 5
status_forcelist = [429]
retry = Retry(
     total=retries,
     read=retries,
     connect=retries,
     respect_retry_after_header=True,
     status_forcelist=status_forcelist,
)

adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

futures = {}
for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

for id in ticketIds:
    comments += futures[id].result().json()['comments']
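
For cases where the 429 is handled by hand instead, it may help to know that the Retry-After header can carry either a number of seconds or an HTTP date. The following is a minimal standard-library sketch of turning either form into a delay. The helper name `retry_after_seconds` and the 1-second default are assumptions made for illustration, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=1.0):
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    if value is None:
        return default  # header missing: fall back to the assumed default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "35"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

With this, `time.sleep(retry_after_seconds(req.headers.get('Retry-After')))` would not raise when the header is absent or is not a plain integer.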

Answer 1 (score: 0)

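I think I have found a solution. I don't know whether it is the best approach, but it avoids another dependency. I can tune max_workers and x (the number of requests submitted per batch) to optimize throughput for the internet speed at this coffee shop.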

import time
from requests_futures.sessions import FuturesSession

session = FuturesSession(max_workers=2)
futures = {}
res = {}
delay = 0
x = 200  # batch size, matching the 200 requests per minute limit
while ticketIds:
  time.sleep(delay)
  delay = 0  # reset so an earlier 429 does not slow every later batch
  # slicing already handles lists shorter than x
  for id in ticketIds[:x]:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

  # iterate over a copy of the batch, since successful IDs are removed below
  for id in ticketIds[:x]:
    res[id] = futures[id].result()
    # remove successful IDs from list
    if res[id].status_code == 200:
      ticketIds.remove(id)
      comments += res[id].json()['comments']
    else:
      # failed IDs stay in the list and are retried in the next batch
      # (assumed fallback of 60 s when the header is absent)
      delay = int(res[id].headers.get('Retry-After', 60))
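
The same "retry only what failed" loop can be sketched without any network code, which makes it easier to test. This is an illustration under assumptions: it uses `concurrent.futures.ThreadPoolExecutor` directly instead of `FuturesSession`, and `fetch_all`, the `fetch` callable, and its `(status, retry_after, payload)` return shape are hypothetical names chosen for the example. It also caps the number of rounds so that a permanently failing ID cannot loop forever.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_all(ids, fetch, max_workers=2, max_rounds=5, sleep=time.sleep):
    """Call fetch(id) -> (status, retry_after, payload) for each id, retrying 429s.

    Returns {id: payload} for every id that eventually returned status 200.
    """
    results = {}
    pending = list(ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _ in range(max_rounds):
            if not pending:
                break
            # Run the whole round concurrently; map() keeps the input order.
            responses = list(pool.map(fetch, pending))
            retry, delay = [], 0
            for id_, (status, retry_after, payload) in zip(pending, responses):
                if status == 200:
                    results[id_] = payload
                elif status == 429:
                    retry.append(id_)
                    delay = max(delay, retry_after)
                # Any other status is dropped here; real code would log or raise.
            pending = retry
            if pending:
                sleep(delay)  # wait the longest delay any response asked for
    return results
```

In the setting of this answer, `fetch` would wrap `session.get(url, auth=zd_secret)` and return the status code, the parsed Retry-After value, and `response.json()['comments']`.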