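I have a simple Python script that loops over a dictionary whose keys are URLs. For each link I have to extract some information and store it in another dictionary. The code below is the first part of the function, and it seems to work as expected. However, it only opens one link at a time, and I think I could improve the run time by doing this in parallel. Do you have any suggestions for a simple way to achieve this in Python?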
def updater(local):
    links = myItems['links']
    for link in links.keys():
        page = requests.get(link)
        soup = BeautifulSoup(page.content, 'html.parser')
        newsoup = soup.find("div", {"id": "overviewQuickstatsBenchmarkDiv"})
        rows = newsoup.findAll('tr')[1]
        counter = 0
        date = ""
        for td in rows.findAll('td'):
            counter += 1
            if td.contents[0] == 'Date':
                date = td.text.replace("Date", "")
            elif counter == 2:
                pass
            elif counter == 3:
                price = re.findall(r"\d+\.\d+", td.string)[0]
This is my attempt using multiprocessing (but I cannot get any results; the code does not seem to run):
def read(url):
    result = {'link': url, 'data': requests.get(url)}
    print("Reading: " + url)
    return result

def updater(local):
    links = myItems['links']
    pool = Pool(processes=5)
    results = pool.map(read, links.keys())
    for link in links.keys():
        # need to read the results and store data into a dictionary
        pass
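
For reference, here is a minimal sketch of the kind of thing I am after, assuming a thread pool is acceptable because the work is mostly waiting on the network. It swaps my `Pool` attempt for the standard library's `concurrent.futures.ThreadPoolExecutor`; the names `fetch_all`, `fetch` and `max_workers` are hypothetical and not part of my original code.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=5):
    """Call fetch(url) for every URL using a pool of threads.

    Returns a dict mapping each URL to whatever fetch returned for it.
    `fetch` is any callable taking one URL (an assumption for this sketch).
    """
    urls = list(urls)  # materialise dict keys so they can be iterated twice
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs,
        # so zipping them back with urls pairs each URL with its result.
        results = executor.map(fetch, urls)
        return dict(zip(urls, results))
```

With this helper, something like `pages = fetch_all(links.keys(), requests.get)` would download the pages concurrently, and the BeautifulSoup parsing from the first snippet could then run over `pages.items()` in an ordinary loop. If any call to `fetch` raises, the exception surfaces when the results are collected, so error handling would still need to be added.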