I have a script that was written for me and I can't run it... I get the following error:

Traceback (most recent call last):
  File "crawler.py", line 56, in <module>
    loop.run_until_complete(future)
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\asyncio\base_events.py", line 568, in run_until_complete
    return future.result()
  File "crawler.py", line 51, in run
    await responses
  File "crawler.py", line 32, in bound_fetch
    await fetch(url, session)
  File "crawler.py", line 22, in fetch
    async with session.get(url, headers=headers) as response:
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client.py", line 843, in __aenter__
    self._resp = await self._coro
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client.py", line 387, in _request
    await resp.start(conn)
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client_reqrep.py", line 748, in start
    message, payload = await self._protocol.read()
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\streams.py", line 533, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None
Am I missing something obvious? I can run the same script without threading just fine. Thanks...
import asyncio
import sys
from itertools import product
from string import ascii_lowercase, digits

from aiohttp import ClientSession
headers = {'User-Agent': 'Mozilla/5.0'}
letter = sys.argv[1]
number = int(sys.argv[2])
first_group = product(ascii_lowercase, repeat=2)
second_group = product(digits, repeat=3)
codeList = [''.join([''.join(k) for k in prod]) for prod in product([letter], first_group, second_group)]
async def fetch(url, session):
    async with session.get(url, headers=headers) as response:
        statusCode = response.status
        if statusCode == 200:
            print("{} statusCode is {}".format(url, statusCode))
        return await response.read()

async def bound_fetch(sem, url, session):
    async with sem:
        await fetch(url, session)
def getUrl(codeIdex):
    return "https://www.blahblah.com/" + codeList[codeIdex] + ".png"
async def run(r):
    tasks = []
    sem = asyncio.Semaphore(1000)
    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(bound_fetch(sem, getUrl(i), session))
            tasks.append(task)
        responses = asyncio.gather(*tasks)
        await responses
loop = asyncio.get_event_loop()
future = asyncio.ensure_future(run(number))
loop.run_until_complete(future)
Answer 0 (score: 0)
I can't leave comments yet, so I'm writing this as an answer. Try a much smaller number for the semaphore limit; I suspect the problem is how many requests you fire at the site concurrently. Also, if you want every request to eventually produce a response, you'll need to catch errors like this one when fetching each URL and retry.