I have a list of image URLs (about 5000 lines) and need a fast way to download all of them. Please help me with my code:
import concurrent.futures
import urllib.request

catname = 'amateur'

def getimg(count, endcount):
    while (count < endcount):
        urllib.request.urlretrieve(URLS[count], catname+'/images/'+catname+str(count)+'.jpg')
        URLS[count] = catname+'/images/'+catname+str(count)+'.jpg'
        count = count + 1

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as e:
    e.submit(getimg, 0, 5000)
It works fine, but it is slow.

Answer 0 (score: 2)
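Your code submits only one task to the executor, so a single worker thread downloads all 5000 images one after another while the other 49 workers sit idle. Submit one task per image instead: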
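import concurrent.futures
import urllib.request

catname = 'amateur'

def getimg(count):
    # URLS is the list of image URLs from the question.
    localpath = '{0}/images/{0}{1}.jpg'.format(catname, count)
    urllib.request.urlretrieve(URLS[count], localpath)
    # Replace the remote URL with the path of the downloaded file.
    URLS[count] = localpath

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as e:
    # One task per image, so up to 50 downloads run concurrently.
    for i in range(5000):
        e.submit(getimg, i)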
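
One thing to watch with e.submit: an exception raised inside a task is stored on the returned future and is not shown unless you ask for the result, so a failed download can pass unnoticed. Below is a rough sketch of one way to collect failures. The names download_all, dest_dir, prefix and fetch are made up for illustration and are not part of the code above; fetch defaults to urllib.request.urlretrieve and is a parameter only so the function can be tried without network access.

```python
import concurrent.futures
import os
import urllib.request


def download_all(urls, dest_dir, prefix="img", max_workers=50,
                 fetch=urllib.request.urlretrieve):
    """Download every URL in `urls` into `dest_dir` using a thread pool.

    Returns (paths, errors): `paths` maps index -> local file path for
    successful downloads, `errors` maps index -> the exception raised.
    All names here are hypothetical, chosen for this sketch.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths, errors = {}, {}

    def job(index):
        local_path = os.path.join(dest_dir, "{0}{1}.jpg".format(prefix, index))
        fetch(urls[index], local_path)
        return local_path

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One future per image, so up to max_workers downloads run at once.
        futures = {pool.submit(job, i): i for i in range(len(urls))}
        for future in concurrent.futures.as_completed(futures):
            index = futures[future]
            try:
                paths[index] = future.result()
            except Exception as exc:  # keep going if a single download fails
                errors[index] = exc
    return paths, errors
```

Something like paths, errors = download_all(URLS, catname + '/images', prefix=catname) should give roughly the same file layout as above, with errors listing the indexes that may be worth retrying.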