我的代码看起来像这样:
def downloadImages(self, path):
for link in self.images: #self.images=list of links opened via requests & written to file
if stuff in link:
#parse link a certain way to get `name`
else:
#parse link a different way to get `name`
r = requests.get(link)
with open(path+name,'wb') as f:
f.write(r.content)
pool = Pool(2)
scraper.prepPage(url)
scraper.downloadImages('path/to/directory')
我想更改downloadImages
以并行化该功能。这是我的尝试:
def downloadImage(self, path, index of self.images currently being processed):
if stuff in link: #because of multiprocessing, there's no longer a loop providing a reference to each index...the index is necessary to parse and complete the function.
#parse link a certain way to get `name`
else:
#parse link a different way to get `name`
r = requests.get(link)
with open(path+name,'wb') as f:
f.write(r.content)
pool = Pool(2)
scraper.prepPage(url)
pool.map(scraper.downloadImages('path/to/directory', ***some way to grab index of self.images****), self.images)
如何引用当前正在处理传递到pool.map()
的迭代的索引?
我对多处理和新功能完全陌生。无法在文档中找到我要找的内容......我也无法通过google或stackoverflow找到类似的问题。