The results are not returned in order, and I need them returned in order.
I am trying to record the rank of each result.
    def parse(self, response):
        sourceHtml = BeautifulSoup(response.body)
        soup = sourceHtml.find("dl", {"id": "resultList"})
        for link in soup.find_all('dd'):
            print(link.get('code'))
Answer 0 (score: 1)
If you want the "code" values in a list, just use a list comprehension:
    def parse(self, response):
        sourceHtml = BeautifulSoup(response.body)
        soup = sourceHtml.find("dl", {"id": "resultList"})
        return [link.get('code') for link in soup.find_all('dd')]
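Since the question mentions recording the rank, here is a minimal sketch of one way to do it, assuming the rank is simply the position of each dd element in the page. The sample markup and the ranked_codes helper are hypothetical stand-ins for response.body and the parse method; find_all returns elements in document order, so enumerate can number them.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for response.body.
SAMPLE_HTML = """
<dl id="resultList">
  <dd code="A1">first</dd>
  <dd code="B2">second</dd>
  <dd code="C3">third</dd>
</dl>
"""


def ranked_codes(html):
    """Return (rank, code) pairs in document order, with rank starting at 1."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("dl", {"id": "resultList"})
    # find_all walks the tree in document order, so enumerate yields the rank.
    return [
        (rank, dd.get("code"))
        for rank, dd in enumerate(result_list.find_all("dd"), start=1)
    ]


print(ranked_codes(SAMPLE_HTML))
```

Carrying the rank alongside each value means the order can be restored later even if the items are processed out of order downstream.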
You can also improve the way the elements are located by using a CSS selector:
    def parse(self, response):
        soup = BeautifulSoup(response.body)
        return [link.get('code') for link in soup.select('dl#resultList dd')]
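The selector version can be tried outside a spider with a small sketch like the one below. The two HTML strings and the codes_via_select helper are made up for illustration. One practical difference it shows: select returns an empty list when the dl is missing, whereas find would return None and the chained find_all call would raise an AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical inputs: one page with the result list, one without it.
HTML_WITH_LIST = '<dl id="resultList"><dd code="A1"></dd><dd code="B2"></dd></dl>'
HTML_WITHOUT_LIST = "<p>no results</p>"


def codes_via_select(html):
    """Collect the 'code' attribute of every dd inside dl#resultList."""
    soup = BeautifulSoup(html, "html.parser")
    # select() returns matches in document order, or [] when nothing matches.
    return [dd.get("code") for dd in soup.select("dl#resultList dd")]


print(codes_via_select(HTML_WITH_LIST))
print(codes_via_select(HTML_WITHOUT_LIST))
```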
It is also a good idea to provide an underlying parser explicitly:
    soup = BeautifulSoup(response.body, "html.parser")
    # or soup = BeautifulSoup(response.body, "html5lib")
    # or soup = BeautifulSoup(response.body, "lxml")
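If it is not known which parsers are installed, one possible approach is to try them in order of preference. This is only a sketch: the make_soup helper and its default preference order are assumptions, not part of the original answer. It relies on BeautifulSoup raising FeatureNotFound when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html.parser")):
    """Build a soup with the first available parser from `preferred`."""
    for parser in preferred:
        try:
            return BeautifulSoup(markup, parser)
        except FeatureNotFound:
            # This parser is not installed; try the next one.
            continue
    raise RuntimeError("none of the requested parsers is available")


soup = make_soup('<dl id="resultList"><dd code="A1"></dd></dl>')
print([dd.get("code") for dd in soup.select("dl#resultList dd")])
```

Naming the parser explicitly keeps the parse tree consistent across machines, because different parsers can repair malformed HTML in different ways.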