Question

I'm trying to get user identities for hundreds of thousands of user ids from the Zendesk API, using Python 3.4.3 and the requests library. It works for many user ids, and then my program receives a bad response from the Zendesk API.

Here is the relevant Python function:

def get_user_identities(user_id):
  url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'

  session = requests.Session()
  session.auth = config.credentials

  response = ''

  while True:
    try:
      response = session.get(url)
    except requests.ConnectionError as error:
      logger.error("ConnectionError: {0}".format(error))
      num_seconds = 30
      logger.info("Sleeping for {} seconds...".format(num_seconds))
      time.sleep(num_seconds)
    else:
      break

  while True:
    response = session.get(url)
    if response.status_code == 429:
      logger.info('Rate limited! Waiting for {} seconds'.format(response.headers['retry-after']))
      time.sleep(int(response.headers['retry-after']))
    else:
      break

  if response.status_code != 200:
    logger.error('Error with status code {}'.format(response.status_code))
    exit()

  data = response.json()

This function is called in a loop and retrieves the user identity for hundreds of thousands of users without any problem, but then it fails because of a bad HTTP response status:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 595, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 393, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 389, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
    response.begin()
  File "/usr/lib/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.4/http/client.py", line 321, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 330, in send
    timeout=timeout
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 640, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.4/dist-packages/urllib3/util/retry.py", line 287, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 72, in <module>
    get_user_identities(user_id)
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 42, in get_user_identities
    response = session.get(url)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 467, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))

However, when I test the same URL for that user identity with HTTPie, it works fine.

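Could it be that the Zendesk REST API endpoint thinks I'm trying to "scrape" it and deliberately drops the connection, as suggested in https://stackoverflow.com/a/33226080/236007?

Or is it something else, and do you have any suggestions for making it work (other than faking the user agent)?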

Answer 1

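Apparently, the code has to catch one more exception, urllib3.exceptions.MaxRetryError, and handle one more HTTP status code (BAD_GATEWAY_ERROR = 502) to cope with the problems raised by the Zendesk REST API endpoint: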

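import time

import requests
import urllib3

# config and logger are assumed to be defined elsewhere in the script.
BAD_GATEWAY_ERROR = 502
RATE_LIMITED_ERROR = 429
MAX_NUM_SECONDS_TO_SLEEP = 30
MAX_NUM_OF_ALLOWED_RETRIES = 10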


def get_user_identities(user_id):
  url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'

  session = requests.Session()
  session.auth = config.credentials

  script_path = get_script_path()

  num_retries = 0
  response = ''

  while True:
    if num_retries > MAX_NUM_OF_ALLOWED_RETRIES:
      logger.error('Tried more than {} times without success. Skipping the user id {} .'
                   .format(MAX_NUM_OF_ALLOWED_RETRIES, user_id))
      return

    try:
      response = session.get(url)

      if response.status_code == RATE_LIMITED_ERROR:
        logger.info('Rate limited! Waiting for {} seconds and will try again.'
                    .format(response.headers['retry-after']))
        time.sleep(int(response.headers['retry-after']))
        num_retries += 1
        continue

      if response.status_code == BAD_GATEWAY_ERROR:
        logger.info('Bad Gateway Error. Waiting for {} seconds and will try again.'
                    .format(str(MAX_NUM_SECONDS_TO_SLEEP)))
        time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
        num_retries += 1
        continue

      if response.status_code != 200:
        logger.error('Error with status code {}. Skipping the user id {}'
                     .format(response.status_code, user_id))
        return

    except (requests.ConnectionError, urllib3.exceptions.MaxRetryError) as error:
      logger.error("ConnectionError: {0}".format(error))
      logger.info("Sleeping for {} seconds...".format(MAX_NUM_SECONDS_TO_SLEEP))
      time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
      num_retries += 1
    else:
      break

  data = response.json()
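
One fragile spot in the function above is int(response.headers['retry-after']): it raises if the header is missing, and the HTTP specification also allows Retry-After to carry an HTTP date instead of a number of seconds. The following is a minimal sketch of a more defensive parser, not part of the original answer; the helper name retry_after_seconds and its default of 30 seconds are hypothetical, and whether Zendesk ever sends the date form is not something the question establishes.

```python
import datetime
import email.utils


def retry_after_seconds(headers, default=30.0, now=None):
    """Return the number of seconds to wait according to a Retry-After header.

    headers: any mapping of header names to string values.
    default: fallback used when the header is absent or unparseable (assumed value).
    now: timezone-aware datetime used as "current time"; defaults to the real clock.
    """
    # Look the header up case-insensitively so that a plain dict works too.
    value = None
    for name, candidate in headers.items():
        if name.lower() == "retry-after":
            value = candidate.strip()
            break
    if value is None:
        return default

    # Form 1: a non-negative integer number of seconds.
    if value.isdigit():
        return float(value)

    # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:30 GMT".
    try:
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=datetime.timezone.utc)
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Never return a negative delay if the date is already in the past.
    return max(0.0, (when - now).total_seconds())
```

With a helper like this, the rate-limit branch could call time.sleep(retry_after_seconds(response.headers)) instead of indexing the header directly.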

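After these changes to get_user_identities, it was able to retrieve more than 700,000 records from the Zendesk REST API endpoint successfully.

The problem I ran into looks like it was caused by the behavior of the Zendesk servers in this case.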
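
The retry policy in this answer (retry on 429, on 502 and on connection errors, give up after a fixed number of attempts, skip on any other non-200 status) can also be written separately from the HTTP call, which makes it testable without touching the network. The sketch below is an illustration under assumptions rather than the answer's actual code: fetch_with_retries, its fetch callable returning a (status_code, body) tuple, and the retry_on parameter are all hypothetical names. With requests, one would presumably pass retry_on=(requests.ConnectionError,), since the traceback in the question shows requests wrapping MaxRetryError in its own ConnectionError.

```python
import time

# Status codes the answer treats as temporary: rate limiting and bad gateway.
RETRYABLE_STATUSES = {429, 502}


def fetch_with_retries(fetch, max_retries=10, wait_seconds=30,
                       retry_on=(ConnectionError,), sleep=time.sleep):
    """Call fetch() until it succeeds, fails permanently, or retries run out.

    fetch: hypothetical zero-argument callable returning (status_code, body).
    retry_on: exception types that should trigger another attempt.
    sleep: injected so tests can avoid real waiting.
    Returns the body on HTTP 200, otherwise None.
    """
    # One initial attempt plus up to max_retries further attempts.
    for _ in range(1 + max_retries):
        try:
            status, body = fetch()
        except retry_on:
            # Connection-level failure: wait, then try again.
            sleep(wait_seconds)
            continue
        if status == 200:
            return body
        if status not in RETRYABLE_STATUSES:
            # Any other status is treated as permanent; the caller skips this item.
            return None
        sleep(wait_seconds)
    # Every attempt failed with a retryable condition.
    return None
```

Keeping the loop free of logging and HTTP details is a design choice for the sketch only; a real script would likely log each retry the way the function above does.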

Why do I get a bad HTTP status with ProtocolError('Connection aborted.', BadStatusLine("''",)) when getting data from the Zendesk API?
