The docs say to reuse the ClientSession:

"Don't create a session per request. Most likely you need a session per application which performs all requests altogether. A session contains a connection pool inside; connection reusage and keep-alives (both are on by default) may speed up total performance."

But there doesn't seem to be any explanation in the docs of how to actually do this. There is one example that might be relevant, but it does not show how to reuse the pool elsewhere: http://aiohttp.readthedocs.io/en/stable/client.html#keep-alive-connection-pooling-and-cookie-sharing

Is something like this correct?
@app.listener('before_server_start')
async def before_server_start(app, loop):
    app.pg_pool = await asyncpg.create_pool(**DB_CONFIG, loop=loop, max_size=100)
    app.http_session_pool = aiohttp.ClientSession()


@app.listener('after_server_stop')
async def after_server_stop(app, loop):
    app.http_session_pool.close()
    app.pg_pool.close()


@app.post("/api/register")
async def register(request):
    # json validation

    async with app.pg_pool.acquire() as pg:
        await pg.execute()  # create unactivated user in db

    async with app.http_session_pool as session:
        # TODO send activation email using SES API
        async with session.post('http://httpbin.org/post', data=b'data') as resp:
            print(resp.status)
            print(await resp.text())

    return HTTPResponse(status=204)
Answer 0 (score: 6)
A few things that could be improved:

1) An instance of ClientSession is a session object. The session contains a connection pool, but it is not itself a "session_pool". I would suggest renaming http_session_pool to http_session, or maybe client_session.

2) A session's close() method is a coroutine. You should await it:

await app.client_session.close()

Or, even better (IMHO), instead of thinking about how to correctly open/close the session, use the standard async context manager protocol, awaiting __aenter__/__aexit__:
@app.listener('before_server_start')
async def before_server_start(app, loop):
    # ...
    app.client_session = await aiohttp.ClientSession().__aenter__()


@app.listener('after_server_stop')
async def after_server_stop(app, loop):
    await app.client_session.__aexit__(None, None, None)
    # ...
3) Pay attention to this info:

"However, if the event loop is stopped before the underlying connection is closed, a ResourceWarning: unclosed transport warning is emitted (when warnings are enabled). To avoid this situation, a small delay must be added before closing the event loop to allow any open underlying connections to close."
I'm not sure it's mandatory here, but it wouldn't be bad to add await asyncio.sleep(0) inside after_server_stop, as the docs suggest:

@app.listener('after_server_stop')
async def after_server_stop(app, loop):
    # ...
    await asyncio.sleep(0)  # http://aiohttp.readthedocs.io/en/stable/client.html#graceful-shutdown
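The reason a zero-second sleep helps: awaiting asyncio.sleep(0) suspends the current coroutine for one loop iteration, which gives callbacks that are already scheduled on the loop (such as connection close callbacks) a chance to run before the loop stops. A minimal stdlib-only sketch of that behavior (no aiohttp involved; the names here are illustrative):

```python
import asyncio

results = []

async def close_transport():
    # stands in for a connection's close callback that aiohttp
    # schedules on the event loop during shutdown
    results.append("transport closed")

async def main():
    # schedule the "close" work as a task, but don't await it directly
    asyncio.ensure_future(close_transport())
    # without this zero-delay sleep, main() could finish (and the loop
    # could stop) before the scheduled task ever runs
    await asyncio.sleep(0)

asyncio.run(main())
# after the sleep(0), the scheduled task has had its turn on the loop
```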
UPD:

A class that implements __aenter__/__aexit__ can be used as an async context manager (i.e. it can be used in an async with statement). It allows some actions to be performed before the inner block executes and after it. This is very similar to regular context managers, but asyncio-related. And just like a regular context manager, an async one can also be driven directly (without async with) by manually awaiting __aenter__/__aexit__.

Why do I think it's better to create/free the session by manually awaiting __aenter__/__aexit__ rather than by using close(), for example? Because we shouldn't have to worry about what actually happens inside __aenter__/__aexit__. Imagine that in a future version of aiohttp the creation of a session changes, for example so that you need to await open(). If you use __aenter__/__aexit__, you won't need to change your code at all.
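To make this concrete, here is a stdlib-only sketch of the async context manager protocol, showing that async with and manually awaiting __aenter__/__aexit__ are equivalent (the Session class here is an illustrative stand-in, not aiohttp's):

```python
import asyncio

log = []

class Session:
    # illustrative stand-in for an async context manager
    # such as aiohttp.ClientSession
    async def __aenter__(self):
        log.append("opened")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        log.append("closed")

async def use_with_statement():
    # the usual form
    async with Session():
        log.append("used")

async def use_manually():
    # the same protocol driven by hand, as the server listeners above do
    session = await Session().__aenter__()
    log.append("used")
    await session.__aexit__(None, None, None)

asyncio.run(use_with_statement())
asyncio.run(use_manually())
# both paths produce the same sequence: opened, used, closed
```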
Answer 1 (score: 1)
I found this question after searching Google for how to reuse an aiohttp ClientSession instance, because my code was triggering this warning message: UserWarning: Creating a client session outside of coroutine is a very dangerous idea

This code may not solve the above problem, although it is related. I am new to asyncio and aiohttp, so this may not be best practice. It's the best I could come up with after reading a lot of seemingly conflicting information.

I created a class ResourceManager, taken from the Python docs, that opens a context.

The ResourceManager instance handles the opening and closing of the aiohttp ClientSession instance via the magic methods __aenter__ and __aexit__, using the BaseScraper.set_session and BaseScraper.close_session wrapper methods.

I was able to reuse a ClientSession instance with the following code.

The BaseScraper class also has methods for authentication. It depends on the lxml third-party package.

import asyncio
from time import time
from contextlib import contextmanager, AbstractContextManager, ExitStack

import aiohttp
import lxml.html
class ResourceManager(AbstractContextManager):
    # Code taken from Python docs: 29.6.2.4. of https://docs.python.org/3.6/library/contextlib.html
    def __init__(self, scraper, check_resource_ok=None):
        self.acquire_resource = scraper.acquire_resource
        self.release_resource = scraper.release_resource
        if check_resource_ok is None:

            def check_resource_ok(resource):
                return True

        self.check_resource_ok = check_resource_ok

    @contextmanager
    def _cleanup_on_error(self):
        with ExitStack() as stack:
            stack.push(self)
            yield
            # The validation check passed and didn't raise an exception
            # Accordingly, we want to keep the resource, and pass it
            # back to our caller
            stack.pop_all()

    def __enter__(self):
        resource = self.acquire_resource()
        with self._cleanup_on_error():
            if not self.check_resource_ok(resource):
                msg = "Failed validation for {!r}"
                raise RuntimeError(msg.format(resource))
        return resource

    def __exit__(self, *exc_details):
        # We don't need to duplicate any of our resource release logic
        self.release_resource()


class BaseScraper:
    login_url = ""
    login_data = dict()  # dict of key, value pairs to fill the login form
    loop = asyncio.get_event_loop()

    def __init__(self, urls):
        self.urls = urls
        self.acquire_resource = self.set_session
        self.release_resource = self.close_session

    async def _set_session(self):
        self.session = await aiohttp.ClientSession().__aenter__()

    def set_session(self):
        set_session_attr = self.loop.create_task(self._set_session())
        self.loop.run_until_complete(set_session_attr)
        return self  # variable after "as" becomes instance of BaseScraper

    async def _close_session(self):
        await self.session.__aexit__(None, None, None)

    def close_session(self):
        close_session = self.loop.create_task(self._close_session())
        self.loop.run_until_complete(close_session)

    def __call__(self):
        fetch_urls = self.loop.create_task(self._fetch())
        return self.loop.run_until_complete(fetch_urls)

    async def _get(self, url):
        async with self.session.get(url) as response:
            result = await response.read()
        return url, result

    async def _fetch(self):
        tasks = (self.loop.create_task(self._get(url)) for url in self.urls)
        start = time()
        results = await asyncio.gather(*tasks)
        print(
            "time elapsed: {} seconds \nurls count: {}".format(
                time() - start, len(self.urls)
            )
        )
        return results

    @property
    def form(self):
        """Create and return form for authentication."""
        form = aiohttp.FormData(self.login_data)
        get_login_page = self.loop.create_task(self._get(self.login_url))
        url, login_page = self.loop.run_until_complete(get_login_page)
        login_html = lxml.html.fromstring(login_page)
        hidden_inputs = login_html.xpath(r'//form//input[@type="hidden"]')
        login_form = {x.attrib["name"]: x.attrib["value"] for x in hidden_inputs}
        for key, value in login_form.items():
            form.add_field(key, value)
        return form

    async def _login(self, form):
        async with self.session.post(self.login_url, data=form) as response:
            if response.status != 200:
                response.raise_for_status()
            print("logged into {}".format(self.login_url))
            await response.release()

    def login(self):
        post_login_form = self.loop.create_task(self._login(self.form))
        self.loop.run_until_complete(post_login_form)


if __name__ == "__main__":
    urls = ("http://example.com",) * 10
    base_scraper = BaseScraper(urls)
    with ResourceManager(base_scraper) as scraper:
        for url, html in scraper():
            print(url, len(html))
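For reference, the _cleanup_on_error helper above follows a pattern from the contextlib docs: stack.push(self) registers the manager's own __exit__ as a rollback action, and stack.pop_all() cancels that rollback once validation succeeds, so the resource is released only if an exception is raised. A stdlib-only sketch of the same idea (the Resource class and acquire_checked function are illustrative, not part of the code above):

```python
from contextlib import AbstractContextManager, ExitStack

log = []

class Resource(AbstractContextManager):
    def __enter__(self):
        log.append("acquired")
        return self

    def __exit__(self, *exc_details):
        log.append("released")

def acquire_checked(ok):
    res = Resource().__enter__()
    with ExitStack() as stack:
        stack.push(res)      # register rollback: res.__exit__ runs on error
        if not ok:
            raise RuntimeError("validation failed")
        stack.pop_all()      # validation passed: cancel rollback, keep resource
    return res

acquire_checked(True)        # resource kept, no "released" entry
try:
    acquire_checked(False)   # rollback fires, res.__exit__ runs
except RuntimeError:
    pass
# log now reads: acquired, acquired, released
```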