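In my script below, if I take out the "return" statement and put a "print" there instead, I get all the results. But if I run it as is, I only get the first item. My question is how to get all the results using "return" in this case; that is, what should the procedure be?
Here is the script: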
import requests
from lxml import html

main_link = "http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-6281/issues"

def abacus_scraper(main_link):
    tree = html.fromstring(requests.get(main_link).text)
    for titles in tree.cssselect("a.issuesInYear"):
        title = titles.cssselect("span")[0].text
        title_link = titles.attrib['href']
        return title, title_link

print(abacus_scraper(main_link))
Result:
('2017 - Volume 53 Abacus', '/journal/10.1111/(ISSN)1467-6281/issues?activeYear=2017')
Answer 0 (score: 4)
As soon as you return from the function, you also exit the for loop, so only the first iteration ever runs.
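To see this behavior in isolation, here is a minimal sketch that needs no network access. The function name `first_only` and the sample list are made up for illustration and are not part of the original script.

```python
def first_only(items):
    for item in items:
        # return ends the whole function on the first pass through the loop,
        # so the remaining items are never visited
        return item

print(first_only(["2017", "2016", "2015"]))  # prints 2017
```

Replacing `return` with `print` makes the loop run to completion, which is why the question's script shows every item in that case.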
You should keep a list inside abacus_scraper and append to it on each iteration. Once the loop has finished, return the list.
For example:
import requests
from lxml import html

main_link = "http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-6281/issues"

def abacus_scraper(main_link):
    results = []  # collects one [title, link] pair per issue
    tree = html.fromstring(requests.get(main_link).text)
    for titles in tree.cssselect("a.issuesInYear"):
        title = titles.cssselect("span")[0].text
        title_link = titles.attrib['href']
        results.append([title, title_link])
    # return only after the loop has visited every element
    return results

print(abacus_scraper(main_link))
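A different option, not the one used in the answer above, is to turn the function into a generator with `yield`, which hands back one pair at a time without leaving the loop. The sketch below runs offline: `sample_issues` and `iter_issues` are hypothetical stand-ins for the parsed elements and the scraper function, with placeholder titles and links.

```python
# Hypothetical stand-in for the elements matched by the CSS selector.
sample_issues = [
    {"title": "Issue A", "href": "/issues/a"},
    {"title": "Issue B", "href": "/issues/b"},
]

def iter_issues(issues):
    for issue in issues:
        # yield produces one pair, then the loop resumes when the caller
        # asks for the next value
        yield issue["title"], issue["href"]

# list() drains the generator, so every pair is collected
all_issues = list(iter_issues(sample_issues))
print(all_issues)
```

The caller can wrap the generator in `list()` as shown, or iterate over it directly with a `for` loop if the results only need to be processed once.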