Why do I get a bad HTTP status with ProtocolError('Connection aborted.', BadStatusLine("''",)) when fetching data from the Zendesk API?

Time: 2017-09-12 15:33:35

Tags: python python-3.x python-requests zendesk-api

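I am trying to get the user identities for a few hundred thousand user IDs from the Zendesk API, using Python 3.4.3 and the requests library. It works for many user IDs, and then my program receives a bad response from the Zendesk API.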

Here is the relevant Python function:

def get_user_identities(user_id):
  url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'

  session = requests.Session()
  session.auth = config.credentials

  response = ''

  while True:
    try:
      response = session.get(url)
    except requests.ConnectionError as error:
      logger.error("ConnectionError: {0}".format(error))
      num_seconds = 30
      logger.info("Sleeping for {} seconds...".format(num_seconds))
      time.sleep(num_seconds)
    else:
      break

  while True:
    response = session.get(url)
    if response.status_code == 429:
      logger.info('Rate limited! Waiting for {} seconds'.format(response.headers['retry-after']))
      time.sleep(int(response.headers['retry-after']))
    else:
      break

  if response.status_code != 200:
    logger.error('Error with status code {}'.format(response.status_code))
    exit()

  data = response.json()

When I test the same URL with HTTPie to get the user identity, it works without any problems. The function is called in a loop to retrieve the user identity for thousands of users; it runs without any problems for a while, but then fails because of a bad HTTP response status:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 595, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 393, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 389, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
    response.begin()
  File "/usr/lib/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.4/http/client.py", line 321, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 330, in send
    timeout=timeout
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 640, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.4/dist-packages/urllib3/util/retry.py", line 287, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 72, in <module>
    get_user_identities(user_id)
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 42, in get_user_identities
    response = session.get(url)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 467, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))

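Could it be that the Zendesk REST API endpoint thinks I am trying to "scrape" it and drops the connection deliberately, as suggested in https://stackoverflow.com/a/33226080/236007?

Or is it something else, and do you have any suggestions for making it work (apart from faking the User-Agent)?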

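1 Answer:

Answer 0: (score: 0)

Apparently, the code has to catch one more exception, urllib3.exceptions.MaxRetryError, and handle one more HTTP status code (BAD_GATEWAY_ERROR = 502) to cope with the problems raised by the Zendesk REST API endpoint: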

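import time

import requests
import urllib3

# config and logger are assumed to be defined elsewhere in the script.

BAD_GATEWAY_ERROR = 502
RATE_LIMITED_ERROR = 429
MAX_NUM_SECONDS_TO_SLEEP = 30
MAX_NUM_OF_ALLOWED_RETRIES = 10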


def get_user_identities(user_id):
  url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'

  session = requests.Session()
  session.auth = config.credentials

  script_path = get_script_path()

  num_retries = 0
  response = ''

  while True:
    if num_retries > MAX_NUM_OF_ALLOWED_RETRIES:
      logger.error('Tried more than {} times without success. Skipping the user id {} .'
                   .format(MAX_NUM_OF_ALLOWED_RETRIES, user_id))
      return

    try:
      response = session.get(url)

      if response.status_code == RATE_LIMITED_ERROR:
        logger.info('Rate limited! Waiting for {} seconds and will try again.'
                    .format(response.headers['retry-after']))
        time.sleep(int(response.headers['retry-after']))
        num_retries += 1
        continue

      if response.status_code == BAD_GATEWAY_ERROR:
        logger.info('Bad Gateway Error. Waiting for {} seconds and will try again.'
                    .format(str(MAX_NUM_SECONDS_TO_SLEEP)))
        time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
        num_retries += 1
        continue

      if response.status_code != 200:
        logger.error('Error with status code {}. Skipping the user id {}'
                     .format(response.status_code, user_id))
        return

    except (requests.ConnectionError, urllib3.exceptions.MaxRetryError) as error:
      logger.error("ConnectionError: {0}".format(error))
      logger.info("Sleeping for {} seconds...".format(MAX_NUM_SECONDS_TO_SLEEP))
      time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
      num_retries += 1
    else:
      break

  data = response.json()

After the changes above, the script was able to retrieve more than 700,000 records from the Zendesk REST API endpoint successfully.
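The retry policy in the function above (retry on 429 and 502, sleep, give up after a fixed number of attempts) can also be pulled out into a small helper that does not depend on the network, which makes it easier to test. What follows is only a minimal sketch of that idea, not the code that was actually run: the names fetch_with_retries, RETRYABLE_STATUSES, and the fetch, retry_exceptions, and sleep parameters are made up for illustration. It assumes fetch() returns an object with status_code and headers attributes, as a requests.Response does. One more observation, offered tentatively: the traceback in the question ends in requests.exceptions.ConnectionError, which the first loop there already catches, so the failing call was probably the unguarded session.get(url) in the second loop.

```python
import time

# Status codes worth retrying; 429 and 502 are the two handled in the answer.
RETRYABLE_STATUSES = {429, 502}


def fetch_with_retries(fetch, max_retries=10, default_wait=30,
                       retry_exceptions=(ConnectionError,), sleep=time.sleep):
    """Call fetch() until it returns a non-retryable response.

    fetch            -- hypothetical zero-argument callable returning an object
                        with .status_code and .headers
    retry_exceptions -- exception types treated as transient; with requests,
                        pass (requests.ConnectionError,)
    sleep            -- injectable so tests do not really wait

    Returns the final response, or None if every attempt was used up.
    """
    for _ in range(max_retries + 1):
        try:
            response = fetch()
        except retry_exceptions:
            sleep(default_wait)
            continue

        if response.status_code in RETRYABLE_STATUSES:
            # Honour the server's Retry-After value (in seconds) when present.
            sleep(int(response.headers.get('retry-after', default_wait)))
            continue

        return response

    return None
```

With requests, a call could look like fetch_with_retries(lambda: session.get(url), retry_exceptions=(requests.ConnectionError,)); the caller still has to check the returned status code, because anything other than 429 and 502 is handed back unchanged.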

The problems I ran into look like the behavior of the Zendesk servers in this case.
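For anyone wondering what BadStatusLine("''") means at the socket level: http.client raises it when the status line it reads back is empty, which is what a client sees when the server closes the connection without sending a response. The sketch below reproduces that locally with a throwaway server; it says nothing about why Zendesk does this. The function names are made up for illustration. On Python 3.5 and later the exception raised is http.client.RemoteDisconnected, which is a subclass of BadStatusLine, so the except clause covers both.

```python
import http.client
import socket
import threading


def serve_once_and_hang_up(server_sock):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = server_sock.accept()
    conn.recv(65536)  # consume the request before closing the socket
    conn.close()


def request_against_silent_server():
    """Return the exception a client gets from a server that never answers."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once_and_hang_up, args=(server_sock,))
    worker.start()

    client = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    try:
        client.request('GET', '/')
        client.getresponse()  # the status line comes back empty
    except http.client.BadStatusLine as error:
        return error
    finally:
        client.close()
        worker.join()
        server_sock.close()
    return None


if __name__ == '__main__':
    print(repr(request_against_silent_server()))
```

requests wraps this low-level error in requests.exceptions.ConnectionError, as the last traceback in the question shows.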