BeautifulSoup loop does not return results in order

Time: 2016-11-28 16:48:11

Tags: python beautifulsoup

The results are not returned in order, and I need them returned in order.

I am trying to record the ranking.

def parse(self, response):
    sourceHtml = BeautifulSoup(response.body)
    soup = sourceHtml.find("dl", {"id": "resultList"})
    for link in soup.find_all('dd'):
        print(link.get('code'))

1 answer:

Answer 0: (score: 1)

If you want the "code" values collected in a list instead of printed, just use a list comprehension:

def parse(self, response):
    sourceHtml = BeautifulSoup(response.body)
    soup = sourceHtml.find("dl", {"id": "resultList"})
    return [link.get('code') for link in soup.find_all('dd')]
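
To try this outside of the spider, here is a minimal sketch that runs the same list comprehension on a small inline document. The markup, the attribute values, and the helper names `extract_codes` and `rank_codes` are assumptions made for illustration; the real page is not shown in the question. `find_all` returns matches in document order, so `enumerate` is enough to attach a rank to each value.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for response.body (not from the question).
SAMPLE_HTML = """
<dl id="resultList">
  <dd code="A100">first</dd>
  <dd code="B200">second</dd>
  <dd code="C300">third</dd>
</dl>
"""


def extract_codes(html):
    """Return the 'code' attribute of every <dd> inside dl#resultList."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("dl", {"id": "resultList"})
    if result_list is None:
        # The container is missing, so there is nothing to extract.
        return []
    # find_all yields elements in the order they appear in the document.
    return [dd.get("code") for dd in result_list.find_all("dd")]


def rank_codes(codes):
    """Pair each code with its 1-based position in the list."""
    return list(enumerate(codes, start=1))


codes = extract_codes(SAMPLE_HTML)
ranked = rank_codes(codes)
for rank, code in ranked:
    print(rank, code)
```

Returning the list rather than printing inside the loop keeps the order explicit and lets the caller decide how to store the rank.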

You can also improve the way you locate the elements by using a CSS selector:

def parse(self, response):
    soup = BeautifulSoup(response.body)
    return [link.get('code') for link in soup.select('dl#resultList dd')]
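
As a quick check that the selector version picks the same elements as the `find` / `find_all` version, the sketch below runs both on an inline document. The markup (including the second, unrelated `dl`) is an assumption made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first <dl> has the id we care about.
HTML = """
<dl id="resultList">
  <dd code="A100">first</dd>
  <dd code="B200">second</dd>
</dl>
<dl id="other">
  <dd code="Z999">unrelated</dd>
</dl>
"""

soup = BeautifulSoup(HTML, "html.parser")

# CSS selector: every <dd> that is a descendant of <dl id="resultList">.
selected = [dd.get("code") for dd in soup.select("dl#resultList dd")]

# Equivalent two-step lookup with find / find_all.
container = soup.find("dl", {"id": "resultList"})
found = [dd.get("code") for dd in container.find_all("dd")]

print(selected)
print(found)
```

The selector form also avoids the intermediate variable, and it returns an empty list instead of raising when the container is absent.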

It is also a good idea to provide an underlying parser explicitly:

soup = BeautifulSoup(response.body, "html.parser")
# or soup = BeautifulSoup(response.body, "html5lib")
# or soup = BeautifulSoup(response.body, "lxml")
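
Since `lxml` and `html5lib` are separate packages that may not be installed, one possible approach is to try a preferred parser and fall back to the built-in `html.parser`. The sketch below is only an illustration: the helper name `make_soup` and the preference order are assumptions, not part of the original answer. It relies on `bs4.FeatureNotFound`, which BeautifulSoup raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html.parser")):
    """Build a soup with the first parser from `preferred` that is installed."""
    for parser_name in preferred:
        try:
            return BeautifulSoup(markup, parser_name)
        except FeatureNotFound:
            # This parser is not installed; try the next one.
            continue
    raise RuntimeError("none of the requested parsers is available")


# Hypothetical markup used only to exercise the helper.
soup = make_soup('<dl id="resultList"><dd code="A100">first</dd></dl>')
print([dd.get("code") for dd in soup.select("dl#resultList dd")])
```

Different parsers can build slightly different trees from malformed HTML, so fixing the choice makes results reproducible across machines.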