How do I use an OR operator between classes in BeautifulSoup findAll?

Time: 2019-10-07 08:04:25

Tags: python beautifulsoup

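I'm having trouble parsing HTML. The site I'm working with has a list whose items have different class names. I'd like to find all of them with a single findAll call, something like this: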

page_soup.findAll("li", {"Class" : "Class1" or "Class2"})

I want an "or" between the classes.

Example HTML:

<ol class="products-list" id="products">
    <li class="item odd">
    </li>
    <li class="item even">
    </li>
    <li class="item last even">
    </li>
</ol>

2 answers:

Answer 0 (score: 1)

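Use select() instead of findAll(). A comma-separated CSS selector list matches any element that satisfies at least one of its selectors, which gives the OR behavior: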

page_soup=BeautifulSoup(html,'html.parser')
for item in page_soup.select(".odd,.even"):
    print(item)

Full code:

from bs4 import BeautifulSoup
html='''<ol class="products-list" id="products">
    <li class="item odd">
    </li>
    <li class="item even">
    </li>
    <li class="item last even">
    </li>
</ol>
'''

page_soup=BeautifulSoup(html,'html.parser')
for item in page_soup.select(".odd,.even"):
    print(item)
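
For comparison, the same OR can be written with find_all itself. This is a minimal sketch, assuming the HTML from the question; the variable names `soup` and `items` are illustrative. It also shows why the attempt in the question does not work: Python evaluates `"Class1" or "Class2"` before BeautifulSoup ever sees it, and the result is just `"Class1"`.

```python
from bs4 import BeautifulSoup

html = '''<ol class="products-list" id="products">
    <li class="item odd"></li>
    <li class="item even"></li>
    <li class="item last even"></li>
</ol>'''

soup = BeautifulSoup(html, "html.parser")

# Python resolves the `or` first, so only the first non-empty string survives.
assert ("Class1" or "Class2") == "Class1"

# A list passed as the class_ filter matches a tag that carries ANY listed class.
items = soup.find_all("li", class_=["odd", "even"])
print(len(items))
```

Unlike `select(".odd,.even")`, this call is restricted to `li` tags; the equivalent CSS selector would be `"li.odd, li.even"`.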

Answer 1 (score: 0)

A complete working sample:

from bs4 import BeautifulSoup

text = """
<body>
    <ul>
        <li class="Class1">Class 1</li>
        <li class="Class2">Class 2</li>
        <div class="Class1 special">Class 1 in div</div>
        <div class="Class2 special">Class2 in div</div>
    </ul>
</body>"""

soup = BeautifulSoup(text,"lxml")
result = soup.find_all(lambda tag: tag.name == 'li' and  
( tag.get('class') == ['Class1'] or tag.get('class') == ['Class2'] ))

print(result)
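
Note that the lambda above compares the whole class list for equality, so it matches `class="Class1"` but would not match an element with several classes, such as `class="item odd"` in the question. A possible variant, sketched here under the assumption that matching any one class is enough, tests set overlap instead. The `wanted` set and the sample markup are hypothetical, and `html.parser` is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

text = """
<ul>
    <li class="Class1">Class 1</li>
    <li class="Class2 special">Class 2</li>
    <li class="Class3">Class 3</li>
</ul>"""

soup = BeautifulSoup(text, "html.parser")

# Hypothetical set of classes to accept; a tag matches if it has at least one.
wanted = {"Class1", "Class2"}

result = soup.find_all(
    # tag.get("class", []) returns the list of classes, or [] if there is none.
    lambda tag: tag.name == "li" and not wanted.isdisjoint(tag.get("class", []))
)

print([li.get_text() for li in result])
```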