Background
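My team has recently been running into problems with downloads from the DoubleClickSearch (SA360) Reporting API: a download would stall and hang for several hours. To address this, we made two changes:
1) Set a timeout on the HTTP connection
2) Use the MediaIoBaseDownload class to download each report file in chunks, retrying a chunk when the connection times out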
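The application no longer hangs, but as soon as a retry is attempted (via the next_chunk method of the MediaIoBaseDownload instance), a ResponseNotReady exception is raised and the program fails.
We would like to know whether anyone has run into something similar, and/or can suggest a likely root cause or a possible fix.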
Details
A simplified version of the code used in our application is shown below:
from io import FileIO
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from httplib2 import Http
CHUNK_SIZE = 10 * 1024 * 1024
REPORT_ID = "(use some existing report ID here)"
CREDENTIALS = "(we will have created OAuth2Credentials based on a JSON file)"
DOWNLOAD_RETRY_LIMIT = 3
http = Http(timeout=60)
auth_http = CREDENTIALS.authorize(http)
service = build(serviceName='doubleclicksearch', version='v2', http=auth_http)
# Create request for first report fragment
request = service.reports().getFile(reportId=REPORT_ID, reportFragment=0)
output_file_handle = FileIO(file="/tmp/my_file.csv", mode="wb")
downloader = MediaIoBaseDownload(output_file_handle, request, chunksize=CHUNK_SIZE)
done = False
# Download chunk by chunk; next_chunk retries each chunk up to DOWNLOAD_RETRY_LIMIT times
while not done:
    _, done = downloader.next_chunk(num_retries=DOWNLOAD_RETRY_LIMIT)
This results in the following log output and traceback:
(googleapiclient.http): Sleeping 1.51 seconds before retry 1 of 3 for media download: GET https://www.googleapis.com/doubleclicksearch/v2/reports/AAAn4KTQbWbVgsE_/files/5?, after ('The read operation timed out',)
......
    status, done = downloader.next_chunk(num_retries=download_retry_limit)
  File "/usr/local/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/googleapiclient/http.py", line 686, in next_chunk
    'GET', headers=headers)
  File "/usr/local/lib/python2.7/site-packages/googleapiclient/http.py", line 164, in _retry_request
    resp, content = http.request(uri, method, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/oauth2client/transport.py", line 169, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 2135, in request
    cachekey,
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1796, in _request
    conn, request_uri, method, body, headers
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1737, in _conn_request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1124, in getresponse
    raise ResponseNotReady()
httplib.ResponseNotReady
NB: this code generally works for smaller downloads, and sometimes even larger downloads succeed, but it fails frequently.
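
A possible variation on the retry loop above, offered as an untested sketch rather than a confirmed fix: instead of relying on num_retries, catch the failure around next_chunk and discard any pooled connections before trying the same chunk again. This rests on two assumptions that are not established by the traceback: that the timed-out connection is being reused in an unusable state, and that the Http object exposes its pooled connections as a dict-like `connections` attribute. The helper name `download_all_chunks` and its parameters are hypothetical, and the sketch is written for Python 3 (`http.client`) rather than the Python 2.7 `httplib` shown in the traceback.

```python
import time
from http.client import HTTPException


def download_all_chunks(downloader, http, max_attempts=3,
                        backoff_seconds=1.0, sleep=time.sleep):
    """Call downloader.next_chunk() until done, resetting connections on failure.

    Hypothetical helper: `downloader` is assumed to behave like
    MediaIoBaseDownload, and `http` is assumed to expose a dict-like
    `connections` attribute holding objects with a close() method.
    """
    done = False
    failures = 0
    while not done:
        try:
            _, done = downloader.next_chunk()
            failures = 0  # a chunk succeeded, so start counting afresh
        except (HTTPException, OSError):
            # ResponseNotReady is an HTTPException; socket timeouts are OSError.
            failures += 1
            if failures >= max_attempts:
                raise
            # Drop pooled connections so the next attempt opens a new socket.
            for conn in list(http.connections.values()):
                conn.close()
            http.connections.clear()
            # Exponential backoff: 1x, 2x, 4x ... the base delay.
            sleep(backoff_seconds * 2 ** (failures - 1))
```

With the objects from the snippet above, this might be called as download_all_chunks(downloader, auth_http). Whether the downloader resumes from the failed chunk after an exception would need to be checked against the installed googleapiclient version.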