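I wrote a Python 3.7 script (using asyncio 3.4.3 and aiohttp 3.5.4) that asynchronously creates a Salesforce Bulk API (v45.0) job/batch for each of multiple objects, each queried by a single SOQL statement. It waits for the batch to complete, downloads (streams) the results to a server once it is done, performs some data transformations, and finally uploads the results synchronously to SQL Server 2016 SP1 (13.0.4560.0). I have had many successful trial runs and thought it was working well. However, I recently started receiving the following error intermittently, and I am at a loss as to how to fix it because there are very few reports or solutions for it on the web: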
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
Sample code snippet:
import asyncio, aiohttp, aiofiles
from simple_salesforce import Salesforce
from xml.etree import ElementTree

# Establish a session using the simple_salesforce module
sf = Salesforce(username=username,
                password=password,
                security_token=securityToken,
                organizationId=organizationId)
sfAPIURL = 'https://myinstance.salesforce.com/services/async/45.0/job/'
sfDataPath = 'C:/Salesforce/Data/'

# Dictionary to store information for the object/job/batch while the script is executing
objectDictionary = {
    'Account': {'job': {'batch': {'id': '8596P00000ihwpJulI', 'results': ['8596V00000Bo9iU'], 'state': 'Completed'},
                        'id': '8752R00000iUjtReqS'},
                'soql': 'select Id,Name from Account'},
    'Contact': {'job': {'batch': {'id': '9874G00000iJnBbVgg', 'results': ['7410t00000Ao9vp'], 'state': 'Completed'},
                        'id': '8800o00000POIkLlLa'},
                'soql': 'select Id,Name from Contact'}}

async def retrieveResults(jobId, batchId, sfObject):
    headers = {"X-SFDC-Session": sf.session_id, 'Content-Encoding': 'gzip'}
    async with aiohttp.ClientSession() as session:
        async with session.get(url=f'{sfAPIURL}{jobId}/batch/{batchId}/result', headers=headers) as r:
            data = await r.text()
            batchResults = ElementTree.fromstring(data)  # list of batch results
            for resultID in batchResults:
                async with session.get(url=f'{sfAPIURL}{jobId}/batch/{batchId}/result/{resultID.text}', headers=headers, timeout=None) as r:
                    # save in temporary file for manipulation later
                    async with aiofiles.open(f'{sfDataPath}{sfObject}_TEMP_JOB_{jobId}_BATCH_{batchId}_RESULT_{resultID.text}.csv', 'wb') as outfile:
                        while True:
                            chunk = await r.content.read(81920)
                            if not chunk:
                                break
                            await outfile.write(chunk)

async def asyncDownload():
    await asyncio.gather(*[retrieveResults(objectDictionary[sfObject]['job']['id'], objectDictionary[sfObject]['job']['batch']['id'], sfObject) for sfObject in objectDictionary])

if __name__ == "__main__":
    asyncio.run(asyncDownload())
Traceback (the error line numbers will not match the code snippet above):
Traceback (most recent call last):
  File "C:\Code\salesforce.py", line 252, in <module>
    asyncio.run(asyncDownload())
  File "C:\Program Files\Python37\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Program Files\Python37\lib\asyncio\base_events.py", line 584, in run_until_complete
    return future.result()
  File "C:\Code\salesforce.py", line 241, in asyncDownload
    await asyncio.gather(*[retrieveResults(objectDictionary[sfObject]['job']['id'], objectDictionary[sfObject]['job']['batch']['id'], sfObject) for sfObject in objectDictionary])
  File "C:\Code\salesforce.py", line 183, in retrieveResults
    chunk = await r.content.read(81920)
  File "C:\Program Files\Python37\lib\site-packages\aiohttp\streams.py", line 369, in read
    await self._wait('read')
  File "C:\Program Files\Python37\lib\site-packages\aiohttp\streams.py", line 297, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
The root of the problem seems to start at r.content.read(81920), which should be streaming the data in 81920-byte chunks, but that is about as far as I can get.
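To make the failure point concrete, here is a minimal sketch of the same chunked read loop, using the standard library's asyncio.StreamReader as a stand-in for the aiohttp response stream. It is an analogy under stated assumptions, not aiohttp's actual internals: PayloadIncomplete, copy_stream and simulate_truncated_download are hypothetical names, and the producer simply injects an error after delivering part of the body. The point it illustrates is that an error set on the stream after some data has arrived only surfaces at the next read call that has to wait, which matches where the traceback ends.

```python
import asyncio


class PayloadIncomplete(Exception):
    """Hypothetical stand-in for aiohttp's ClientPayloadError."""


async def copy_stream(reader, sink, chunk_size=81920):
    """Read from reader in chunks until EOF, appending each chunk to sink."""
    total = 0
    while True:
        chunk = await reader.read(chunk_size)
        if not chunk:  # b'' signals a clean end of stream
            break
        sink.append(chunk)
        total += len(chunk)
    return total


async def simulate_truncated_download():
    """Deliver 10 bytes, then fail the stream before EOF is ever signalled."""
    reader = asyncio.StreamReader()
    received = []

    async def producer():
        reader.feed_data(b"x" * 10)
        await asyncio.sleep(0.01)
        # Simulates the peer going away mid-body: no feed_eof(), just an error.
        reader.set_exception(PayloadIncomplete("Response payload is not completed"))

    task = asyncio.ensure_future(producer())
    error = None
    try:
        await copy_stream(reader, received, chunk_size=4)
    except PayloadIncomplete as exc:
        error = exc
    await task
    return b"".join(received), error


if __name__ == "__main__":
    data, error = asyncio.run(simulate_truncated_download())
    print(len(data), "bytes received before:", repr(error))
```

In this sketch the bytes that arrived before the failure have already been handed to the sink, so a real download that fails this way would leave a partial file behind.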
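I know this is not a network issue, because other small jobs on this server that connect to external sources complete without any problem while this one is running. Does anyone know what is going on here? Am I doing something wrong? Is my code bad, or is it something else?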
Thanks!
-Edit:
I tried using iter_any() instead of read(), but still got the same error...
async for data in r.content.iter_any():
    await outfile.write(data)
I tried readline(), but still got the same error...
async for line in r.content.readline():
    await outfile.write(line)
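Since changing the read method does not change the outcome, one possible workaround is to retry the whole download when it fails. This is a minimal sketch, assuming the failure is transient and that restarting a result download from the beginning is acceptable. The names retry_async, TransientError and make_flaky are hypothetical and use only the standard library; in the script above, retry_on would be aiohttp.ClientPayloadError and make_attempt would wrap the request plus the file write, reopening the temporary file in 'wb' mode so that a partial download is overwritten.

```python
import asyncio


async def retry_async(make_attempt, retry_on, attempts=3, base_delay=0.5):
    """Await make_attempt() until it succeeds or the attempts are used up.

    make_attempt: zero-argument callable returning a fresh coroutine per try.
    retry_on: exception class (or tuple of classes) that triggers a retry.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_attempt()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


class TransientError(Exception):
    """Hypothetical stand-in for aiohttp.ClientPayloadError."""


def make_flaky(failures):
    """Build a fake download that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def attempt():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("Response payload is not completed")
        return state["calls"]  # number of the call that finally succeeded

    return attempt


if __name__ == "__main__":
    calls = asyncio.run(retry_async(make_flaky(2), TransientError, base_delay=0.1))
    print("succeeded on call", calls)
```

A retry only hides the symptom. If the failures persist, the cause of the incomplete responses still needs to be found.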