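I am running a script that takes a URL, downloads the file locally, and then passes the file name as an argument to a function. The problem is that this takes a long time. I tried using a ThreadPool, but it brought no improvement. I must be doing something wrong; this is what it looks like: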
from multiprocessing.pool import ThreadPool

pool = ThreadPool(processes=8)
ocr_result = pool.apply_async(download_file, (url,))
file_name = ocr_result.get()
async_result = pool.apply_async(return_label, (file_name,))
prediction, prediction_list = async_result.get()
Any suggestions would be very helpful. Thanks in advance.
Answer 0 (score: 1)
As suggested in the comments, here is an example using aiohttp and asyncio:
import asyncio
import aiohttp

# LOGIN_URL, payload and download_links are assumed to be defined elsewhere
async def main():
    # limit concurrency
    connector = aiohttp.TCPConnector(limit=100)
    async with aiohttp.ClientSession(connector=connector) as sess:
        # login if required
        async with sess.post(LOGIN_URL, data=payload) as resp:
            # ensure login success
            assert resp.status == 200
        # schedule all downloads together so they overlap instead of running one by one
        await asyncio.gather(*(download(url, sess) for url in download_links))
Your download function would look like this:
async def download(url, sess):
    async with sess.get(url) as resp:
        if resp.status == 200:
            # read the response body, then post process it
            data = await resp.read()
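One thing to keep in mind with asyncio: coroutines only overlap when they are scheduled together, for example with `asyncio.gather`. Awaiting them one at a time in a loop runs them sequentially. Below is a minimal sketch of that idea, with `asyncio.sleep` standing in for the network request so it runs without aiohttp. The names `fake_download` and `download_all`, the example URLs, and the `max_concurrent` parameter are all hypothetical.

```python
import asyncio
import time


async def fake_download(url, sem):
    # Hypothetical stand-in for a real aiohttp request.
    async with sem:               # cap the number of downloads in flight
        await asyncio.sleep(0.1)  # simulated network latency
        return url.rsplit("/", 1)[-1]


async def download_all(urls, max_concurrent=5):
    sem = asyncio.Semaphore(max_concurrent)
    # gather schedules every coroutine at once and returns results in input order
    return await asyncio.gather(*(fake_download(u, sem) for u in urls))


urls = [f"https://example.com/file{i}.png" for i in range(10)]
start = time.perf_counter()
file_names = asyncio.run(download_all(urls))
elapsed = time.perf_counter() - start
print(file_names[:2], f"{elapsed:.2f}s")
```

Ten simulated downloads of 0.1 s each would take about a second if awaited in turn; with five allowed in flight at once they should finish in roughly two batches. The semaphore is optional and only there to avoid flooding the server.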
Finally, run everything from an event loop:
# asyncio.run creates the event loop, runs main() to completion, and closes the loop
asyncio.run(main())
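As for why the thread pool in the question did not help: calling `.get()` right after `apply_async` blocks until that single task finishes, so only one worker is ever busy. A pool only pays off when many URLs are submitted before any result is collected. Here is a rough sketch, assuming there is a list of URLs to process. The `download_file` and `return_label` bodies are hypothetical stand-ins for the functions in the question, and `process` is a helper name made up for illustration.

```python
from multiprocessing.pool import ThreadPool
import time


def download_file(url):
    # Hypothetical stand-in: pretend to download and return a local file name.
    time.sleep(0.1)
    return url.rsplit("/", 1)[-1]


def return_label(file_name):
    # Hypothetical stand-in for the prediction function in the question.
    return "label", [file_name]


def process(url):
    # Run both steps for one URL inside a single worker thread.
    return return_label(download_file(url))


urls = [f"https://example.com/file{i}.png" for i in range(8)]
with ThreadPool(processes=8) as pool:
    # map submits every URL up front; results come back in input order
    results = pool.map(process, urls)

prediction, prediction_list = results[0]
```

Threads mostly help with the download step, which spends its time waiting on the network. If `return_label` is CPU-bound Python code, the GIL will likely limit how much the threads overlap there, and a process pool may be worth trying for that part.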