I have an async function that fetches data from a site:
import traceback
import aiohttp

async def get_matches_info(url):
    async with aiohttp.ClientSession() as session:
        try:
            async with session.get(url, proxy=proxy) as response:
                ...
                ...
                ...
                ...
        except Exception:
            print('ERROR GET URL: ', url)
            # print_exc() writes the traceback itself and returns None
            traceback.print_exc()
I have a list of about 200 links. Almost always everything works fine, but occasionally the following error appears:
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 924, in _wrap_create_connection
    await self._loop.create_connection(*args, **kwargs))
  File "C:\Python37\lib\asyncio\base_events.py", line 986, in create_connection
    ssl_handshake_timeout=ssl_handshake_timeout)
  File "C:\Python37\lib\asyncio\base_events.py", line 1014, in _create_connection_transport
    await waiter
ConnectionResetError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "parser.py", line 90, in get_matches_info
    async with session.get(url, proxy=proxy) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 476, in _request
    timeout=real_timeout
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 522, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 851, in _create_connection
    req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 1085, in _create_proxy_connection
    req=req)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 931, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host www.myscore.com.ua:443 ssl:None [None]
I checked every link that appears in the errors, and they all work. Why does this happen?
Answer 0 (score: 0)
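This is probably a server-side limit on concurrent requests: the server may be treating the burst of requests as a DoS attack. If you control the server and it runs Apache, you can raise MaxKeepAliveRequests (the number of requests allowed per persistent connection) in the httpd configuration. If not, you can cap the number of concurrent async requests on the client side with asyncio's semaphores. The example below limits them to 100 concurrent requests.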
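# Create the semaphore once in the calling coroutine and share it:
#     sem = asyncio.Semaphore(100)
async def get_matches_info(url, sem):
    async with sem:  # waits here while 100 requests are already in flight
        async with aiohttp.ClientSession() as session:
            try:
                async with session.get(url, proxy=proxy) as response:
                    ...
Note that the semaphore must be a single object shared by all calls. If it is created inside the function, every call (including recursive ones) gets a fresh semaphore and nothing is actually limited, so create it once outside the function and pass it in as an argument.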
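
To make the shared-semaphore idea concrete, here is a minimal sketch of the calling side. It is an illustration under stated assumptions, not the asker's actual code: `fetch_all`, `bounded`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real aiohttp request so the snippet runs without network access.

```python
import asyncio


async def fetch_all(urls, fetch, limit=100):
    """Run fetch(url) for every URL with at most `limit` calls in flight."""
    # Created once, inside the running event loop, and shared by all tasks.
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            try:
                return await fetch(url)
            except Exception as exc:
                # Hand the error back so one failed URL does not abort the rest.
                return exc

    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_fetch(url):
    # Hypothetical stand-in for the real aiohttp request.
    await asyncio.sleep(0.01)
    return "body of " + url


if __name__ == "__main__":
    urls = ["https://example.com/%d" % i for i in range(200)]
    results = asyncio.run(fetch_all(urls, fake_fetch, limit=100))
    print(len(results))
```

In real code, `fetch` would be a coroutine that performs the `session.get(...)` call. Lowering `limit` well below 100 may also be worth trying if the server keeps resetting connections, since the right value depends on what the server and proxy tolerate.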