Hi everyone, I'm new to Python (I started less than 2 weeks ago), so I need some tips and tricks :p
What is the fastest and most efficient way to make around 1500 API requests?
Currently the code below works for me, but it takes 8s to execute 1400 API requests, and when I try it without threads it takes 9s. What am I doing wrong??!
import curio
import curio_http
from threading import Thread

# Fetch a single URL and decode the JSON response
async def fetch_one(url):
    async with curio_http.ClientSession() as session:
        response = await session.get(url)
        content = await response.json()
        return content

# Spawn one curio task per URL, then join them all to collect the responses
async def fetchMultiURLs(url_list):
    tasks = []
    responses = []
    for url in url_list:
        task = await curio.spawn(fetch_one(url))
        tasks.append(task)
    for task in tasks:
        content = await task.join()
        responses.append(content)
        print(content)
def MultiFetch(URLS, X):
    MyThreadsList = []
    MyThreadsResults = []
    # Number of threads = ceil(len(URLS) / X): one thread per chunk of X URLs
    N_Threads = (lambda x: int(x / X) if (x % X == 0) else int(x / X) + 1)(len(URLS))
    for i in range(N_Threads):
        MyThreadsList.append(Thread(target=curio.run, args=(fetchMultiURLs(URLS[i*X:(X*i + X)]),)))
        MyThreadsList[i].start()
    for i in range(N_Threads):
        # Note: Thread.join() always returns None, so no actual results are collected here
        MyThreadsResults.append(MyThreadsList[i].join())
    return MyThreadsResults
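For reference, a minimal sketch of how the code above might be invoked; the URL list and chunk size here are made up for illustration:

# Hypothetical usage: 1400 URLs in chunks of 200, i.e. 7 threads
urls = ["https://api.example.com/item/%d" % i for i in range(1400)]
results = MultiFetch(urls, 200)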
Answer 0 (score: 0)
Finally I found a solution :) Fetching 1400 URLs takes 2.2 seconds.
I used the 3rd suggestion (an async loop inside each process):
import curio
import curio_http
from multiprocessing import Pool

# Fetch 1 URL
async def fetch_one(url):
    async with curio_http.ClientSession() as session:
        response = await session.get(url)
        content = await response.json()
        return content
# Fetch X URLs
async def fetchMultiURLs(url_list):
    tasks = []
    responses = []
    for url in url_list:
        task = await curio.spawn(fetch_one(url))
        tasks.append(task)
    for task in tasks:
        content = await task.join()
        responses.append(content)
    return responses
# I tried to use a lambda instead of this function, but it didn't work
def RuningCurio(X):
    return curio.run(fetchMultiURLs(X))
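The likely reason the lambda didn't work: multiprocessing.Pool.map pickles the callable in order to send it to the worker processes, and lambdas cannot be pickled, so a module-level function like RuningCurio is required. A minimal illustration of the difference:

# Raises a pickling error: lambdas cannot be sent to worker processes
# P.map(lambda x: curio.run(fetchMultiURLs(x)), MyListofLists)
# Works: a module-level function is picklable by reference
# P.map(RuningCurio, MyListofLists)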
# Create one process and async loop per chunk of X URLs, depending on len(URLS) / X
# In my case (I'm on a VPS) a single process can easily fetch 700 links in under a second,
# so don't use multiprocessing below that number of URLs (just use the fetchMultiURLs function)
def MultiFetch(URLS, X):
    MyListofLists = []
    LengthURLs = len(URLS)
    # Number of processes = ceil(LengthURLs / X)
    N_Process = int(LengthURLs / X) if (LengthURLs % X == 0) else int(LengthURLs / X) + 1
    for i in range(N_Process):  # Create a list of lists ( [ [1,2,3], [4,5,6], [7,8,9] ] )
        MyListofLists.append(URLS[i*X:(X*i + X)])
    P = Pool(N_Process)
    return P.map(RuningCurio, MyListofLists)
# I'm fetching 2100 URLs in 1.1s, I hope this solution helps you guys
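A minimal sketch of how this version might be driven from a script; the URL list and chunk size are assumptions. The __main__ guard matters because multiprocessing re-imports the module in each worker on spawn-based platforms (Windows, recent macOS):

# Hypothetical usage: 2100 URLs in chunks of 700, i.e. 3 processes
if __name__ == "__main__":
    urls = ["https://api.example.com/item/%d" % i for i in range(2100)]
    results = MultiFetch(urls, 700)  # list of per-chunk response lists
    print(len(results), "chunks fetched")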