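I am trying to get user identities from the Zendesk API using Python 3.4.3 and the requests library. It works for many user IDs, and then my program receives a bad response from the Zendesk API.
Here is the relevant Python function, which is called in a loop to retrieve the user identity for hundreds of thousands of user IDs:
def get_user_identities(user_id):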
    url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'
    session = requests.Session()
    session.auth = config.credentials
    response = ''
    while True:
        try:
            response = session.get(url)
        except requests.ConnectionError as error:
            logger.error("ConnectionError: {0}".format(error))
            num_seconds = 30
            logger.info("Sleeping for {} seconds...".format(num_seconds))
            time.sleep(num_seconds)
        else:
            break
    while True:
        response = session.get(url)
        if response.status_code == 429:
            logger.info('Rate limited! Waiting for {} seconds'.format(response.headers['retry-after']))
            time.sleep(int(response.headers['retry-after']))
        else:
            break
    if response.status_code != 200:
        logger.error('Error with status code {}'.format(response.status_code))
        exit()
    data = response.json()
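The loop runs without any problems for a while, but then breaks down because of a bad HTTP response status, even though the same URL works fine when I test it with HTTPie to get a user identity. The traceback is: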
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 595, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 393, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 389, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
    response.begin()
  File "/usr/lib/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.4/http/client.py", line 321, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 330, in send
    timeout=timeout
  File "/usr/local/lib/python3.4/dist-packages/urllib3/connectionpool.py", line 640, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.4/dist-packages/urllib3/util/retry.py", line 287, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 72, in <module>
    get_user_identities(user_id)
  File "/home/emre.sevinc/code/company-zendesk/get_user_identities.py", line 42, in get_user_identities
    response = session.get(url)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 467, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='companyname.zendesk.com', port=443): Max retries exceeded with url: /api/v2/users/1608220001/identities.json (Caused by ProtocolError('Connection aborted.', BadStatusLine("''",)))
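Could it be that the Zendesk REST API endpoint thinks I am trying to "scrape" it and is dropping the connection deliberately, as suggested at https://stackoverflow.com/a/33226080/236007?
Or is it something else, and do you have any recommendations for making it work (apart from faking the User-Agent)?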
Answer 0 (score: 0)
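Apparently, the code has to catch one more exception, urllib3.exceptions.MaxRetryError, and handle one more HTTP status code (BAD_GATEWAY_ERROR = 502) to work around the problems raised by the Zendesk REST API endpoint: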
BAD_GATEWAY_ERROR = 502
RATE_LIMITED_ERROR = 429
MAX_NUM_SECONDS_TO_SLEEP = 30
MAX_NUM_OF_ALLOWED_RETRIES = 10

def get_user_identities(user_id):
    url = config.zendesk_api_url + '/api/v2/users/' + user_id + '/identities.json'
    session = requests.Session()
    session.auth = config.credentials
    script_path = get_script_path()
    num_retries = 0
    response = ''
    while True:
        if num_retries > MAX_NUM_OF_ALLOWED_RETRIES:
            logger.error('Tried more than {} times without success. Skipping the user id {} .'
                         .format(MAX_NUM_OF_ALLOWED_RETRIES, user_id))
            return
        try:
            response = session.get(url)
            if response.status_code == RATE_LIMITED_ERROR:
                logger.info('Rate limited! Waiting for {} seconds and will try again.'
                            .format(response.headers['retry-after']))
                time.sleep(int(response.headers['retry-after']))
                num_retries += 1
                continue
            if response.status_code == BAD_GATEWAY_ERROR:
                logger.info('Bad Gateway Error. Waiting for {} seconds and will try again.'
                            .format(str(MAX_NUM_SECONDS_TO_SLEEP)))
                time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
                num_retries += 1
                continue
            if response.status_code != 200:
                logger.error('Error with status code {}. Skipping the user id {}'
                             .format(response.status_code, user_id))
                return
        except (requests.ConnectionError, urllib3.exceptions.MaxRetryError) as error:
            logger.error("ConnectionError: {0}".format(error))
            logger.info("Sleeping for {} seconds...".format(MAX_NUM_SECONDS_TO_SLEEP))
            time.sleep(MAX_NUM_SECONDS_TO_SLEEP)
            num_retries += 1
        else:
            break
    data = response.json()
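After the changes above, it was able to retrieve more than 700,000 records from the Zendesk REST API endpoint successfully.
The problem I ran into looks like it is simply how the Zendesk servers behave in this kind of situation.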
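The retry policy described above (wait and retry on 429, on 502, and on connection errors, then give up after a fixed number of attempts) could also be pulled out into a small helper. The following is only a sketch under assumptions: fetch_with_retries, its parameters, and RETRYABLE_STATUS_CODES are hypothetical names chosen for illustration and are not part of requests or the Zendesk API. The helper takes any zero-argument callable that returns an object with status_code and headers attributes, so it does not import requests itself.

```python
import time

RETRYABLE_STATUS_CODES = (429, 502)  # rate limited, bad gateway


def fetch_with_retries(fetch, max_retries=10, default_wait=30,
                       retry_exceptions=(OSError,), sleep=time.sleep):
    """Call fetch() until it returns a non-retryable response.

    Returns that response, or None once the initial attempt and
    max_retries further attempts have all failed.
    """
    for _ in range(max_retries + 1):
        try:
            response = fetch()
        except retry_exceptions:
            # Connection-level failure: wait a fixed time, then try again.
            sleep(default_wait)
            continue
        if response.status_code in RETRYABLE_STATUS_CODES:
            # Prefer the server-provided delay when the header is present.
            sleep(int(response.headers.get('retry-after', default_wait)))
            continue
        # Any other status (200 or a non-retryable error) goes to the caller.
        return response
    return None
```

With the session from the function above, a call might look like fetch_with_retries(lambda: session.get(url)). The default retry_exceptions=(OSError,) is assumed to cover requests.ConnectionError, since the requests exception classes derive from IOError; pass a narrower tuple if that is too broad for your case. The caller still has to check for None and for non-200 status codes before calling response.json().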