Collecting information with Beautiful Soup from tags that share the same name

Time: 2018-10-15 18:46:24

Tags: python beautifulsoup

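I want to scrape information from an HTML page with Python's Beautiful Soup, but every piece of information I need sits in tags with the same name. How can I tell the individual pieces apart?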

All the information I need is in separate class="hAyfc" tags.

1 answer:

Answer 0: (score: 1)

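The results are ordered. find_all returns the matches in the same order as they appear in the HTML, so you only need to pick each result out by its position.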

from bs4 import BeautifulSoup

html = """
<div class = "hAyfc">
    <div class = "BgcNfc">pro </div>
    <span class = "htlgb">
        <div>
            <span class = "htlgb">
                codeA
            </span>
        </div>
    </span>
</div>

<div class = "hAyfc">
    <div class = "BgcNfc">pro </div>
    <span class = "htlgb">
        <div>
            <span class = "htlgb">
                codeB
            </span>
        </div>
    </span>
</div>
"""

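# Parse the markup (the "lxml" parser requires the lxml package to be installed)
bs = BeautifulSoup(html, "lxml")
# find_all returns matches in document order, so result[0] is the first hAyfc block
result = [e.text for e in bs.find_all("div", {"class": "hAyfc"})]
print(result)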
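
If picking results by position feels fragile, another option is to pair each block's label with its value. The following is only a sketch under some assumptions: that each hAyfc block holds one BgcNfc label and one htlgb value, as in the sample above, and that the labels differ from block to block. The label texts "Updated" and "Size" are hypothetical placeholders (the sample above uses "pro" for both), and html.parser is used here only so that no extra package is required.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the answer above; the label texts are hypothetical
html = """
<div class="hAyfc">
    <div class="BgcNfc">Updated</div>
    <span class="htlgb"><div><span class="htlgb">codeA</span></div></span>
</div>
<div class="hAyfc">
    <div class="BgcNfc">Size</div>
    <span class="htlgb"><div><span class="htlgb">codeB</span></div></span>
</div>
"""

# html.parser ships with Python, so nothing extra has to be installed
soup = BeautifulSoup(html, "html.parser")

info = {}
for block in soup.find_all("div", class_="hAyfc"):
    label = block.find("div", class_="BgcNfc")   # the name of the field
    value = block.find("span", class_="htlgb")   # the outer span wrapping the value
    if label is None or value is None:
        continue  # skip blocks that do not follow the assumed layout
    info[label.get_text(strip=True)] = value.get_text(strip=True)

print(info)
```

With this, a value can be looked up by its label, for example info["Size"], instead of by its index in the list. If two blocks share the same label, the later one overwrites the earlier one, so positional access is still the safer choice in that case.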