Question

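I wrote a Python 3.7 script that asynchronously (asyncio 3.4.3 and aiohttp 3.5.4) creates a Salesforce Bulk API (v45.0) job/batch for each of multiple objects, each queried by a single SOQL statement. It waits for each batch to complete, downloads (streams) the results to a server, does some data transformations, and finally uploads the results synchronously to SQL Server 2016 SP1 (13.0.4560.0). I have had plenty of successful trial runs and thought it was working well. However, I recently started intermittently receiving the following error, and I am at a loss as to how to fix it, since there are very few reports or solutions for it on the web:

aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

Sample code snippet: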

import asyncio,aiohttp,aiofiles
from simple_salesforce import Salesforce
from xml.etree import ElementTree

#Establish a session using the simple_salesforce module
sf = Salesforce(username=username,
                password=password,
                security_token=securityToken,
                organizationId=organizationId)
sfAPIURL = 'https://myinstance.salesforce.com/services/async/45.0/job/'
sfDataPath = 'C:/Salesforce/Data/'

#Dictionary to store information for the object/job/batch while the script is executing
objectDictionary = {
    'Account': {'job': {'batch': {'id': '8596P00000ihwpJulI', 'results': ['8596V00000Bo9iU'], 'state': 'Completed'},
                        'id': '8752R00000iUjtReqS'},
                'soql': 'select Id,Name from Account'},

    'Contact': {'job': {'batch': {'id': '9874G00000iJnBbVgg', 'results': ['7410t00000Ao9vp'], 'state': 'Completed'},
                        'id': '8800o00000POIkLlLa'},
                'soql': 'select Id,Name from Contact'}}

async def retrieveResults(jobId, batchId, sfObject):
    headers = {"X-SFDC-Session": sf.session_id, 'Content-Encoding': 'gzip'}
    async with aiohttp.ClientSession() as session:
        async with session.get(url=f'{sfAPIURL}{jobId}/batch/{batchId}/result', headers=headers) as r:
            data = await r.text()
            batchResults = ElementTree.fromstring(data) #list of batch results
            for resultID in batchResults:
                async with session.get(url=f'{sfAPIURL}{jobId}/batch/{batchId}/result/{resultID.text}', headers=headers, timeout=None) as r:
                    async with aiofiles.open(f'{sfDataPath}{sfObject}_TEMP_JOB_{jobId}_BATCH_{batchId}_RESULT_{resultID.text}.csv', 'wb') as outfile: #save in temporary file for manipulation later
                        while True:
                            chunk = await r.content.read(81920)
                            if not chunk:
                                break
                            await outfile.write(chunk)

async def asyncDownload():
    await asyncio.gather(*[retrieveResults(objectDictionary[sfObject]['job']['id'], objectDictionary[sfObject]['job']['batch']['id'], sfObject) for sfObject in objectDictionary])

if __name__ == "__main__":
    asyncio.run(asyncDownload())
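
One thing that could be tested here is whether the failures depend on how many downloads run at once. This is only an assumption; nothing in the traceback proves it. The sketch below bounds concurrency with asyncio.Semaphore. The name gatherBounded and the limit parameter are hypothetical and not part of the original script.

```python
import asyncio

async def gatherBounded(coroFactories, limit=2):
    """Run the coroutines produced by coroFactories, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def runOne(factory):
        async with semaphore:  # wait here until a slot is free
            return await factory()

    # gather returns results in the same order as the input list
    return await asyncio.gather(*(runOne(f) for f in coroFactories))
```

In asyncDownload, each entry would be a zero-argument callable such as functools.partial(retrieveResults, jobId, batchId, sfObject), so that a coroutine is only created once a slot is available.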

Traceback (the line numbers do not match the code snippet above):

Traceback (most recent call last):
  File "C:\Code\salesforce.py", line 252, in <module>
    asyncio.run(asyncDownload())
  File "C:\Program Files\Python37\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Program Files\Python37\lib\asyncio\base_events.py", line 584, in run_until_complete
    return future.result()
  File "C:\Code\salesforce.py", line 241, in asyncDownload
    await asyncio.gather(*[retrieveResults(objectDictionary[sfObject]['job']['id'], objectDictionary[sfObject]['job']['batch']['id'], sfObject) for sfObject in objectDictionary])
  File "C:\Code\salesforce.py", line 183, in retrieveResults
    chunk = await r.content.read(81920)
  File "C:\Program Files\Python37\lib\site-packages\aiohttp\streams.py", line 369, in read
    await self._wait('read')
  File "C:\Program Files\Python37\lib\site-packages\aiohttp\streams.py", line 297, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

The root of the problem seems to begin at r.content.read(81920), which should be streaming the data in 81920-byte chunks, but that is about as far as I can get.
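
Since the error is intermittent, one possible workaround (not a fix for the underlying cause) would be to retry a failed download from the start. The sketch below is a generic retry helper with exponential backoff. The name retryAsync and its parameters are hypothetical, and it assumes the wrapped coroutine can safely be re-run, for example because it reopens the output file in 'wb' mode and so overwrites any partial data.

```python
import asyncio

async def retryAsync(makeAttempt, retryOn, attempts=3, baseDelay=1.0):
    """Await makeAttempt() up to `attempts` times, retrying only on `retryOn`."""
    for attempt in range(1, attempts + 1):
        try:
            return await makeAttempt()
        except retryOn:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # exponential backoff: baseDelay, 2*baseDelay, 4*baseDelay, ...
            await asyncio.sleep(baseDelay * 2 ** (attempt - 1))
```

With aiohttp this might be called as await retryAsync(lambda: downloadResult(session, url, path), aiohttp.ClientPayloadError), where downloadResult is a hypothetical coroutine wrapping the inner session.get / file-write loop for a single result ID.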

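I know this is not a network issue, because other small jobs on this server that connect to external sources complete without problems while this job is running. Does anyone know what is going on here? Am I doing something wrong? Is my code bad, or what?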

Thanks!

-Edit:

I tried iter_any() instead of read() and still get the same error...

async for data in r.content.iter_any():
    await outfile.write(data)

I tried readline() and still get the same error...

async for line in r.content.readline():
    await outfile.write(line)
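
A note on line-by-line reading: readline() is a coroutine that returns one line per await, so it is not itself something async for can iterate. In aiohttp the line-by-line form is async for line in r.content. The sketch below shows the same pattern using asyncio.StreamReader from the standard library as a stand-in for r.content, so it can run without a network connection. The sample bytes are made up for illustration, and collectLines and demo are hypothetical names.

```python
import asyncio

async def collectLines(reader):
    """Iterate a stream reader line by line with `async for`."""
    lines = []
    async for line in reader:  # the reader itself is the async iterator
        lines.append(line)
    return lines

async def demo():
    # asyncio.StreamReader stands in for aiohttp's r.content in this sketch
    reader = asyncio.StreamReader()
    reader.feed_data(b"Id,Name\n001,Acme\n")  # made-up sample data
    reader.feed_eof()
    return await collectLines(reader)
```

Reading by line only changes how the stream is consumed. If the connection is closed before the full body arrives, the same ClientPayloadError would still be expected from aiohttp.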

Response payload is not completed using asyncio/aiohttp

0 answers: