from bs4 import BeautifulSoup
html = """
<div class="aa bb"></div>
<div class="aa ccc"></div>
<div class="aa"></div>
"""
def find(aclass):
    print(aclass)
    return aclass != "bb"
soup = BeautifulSoup(html, 'lxml')
div = soup.find_all('div', attrs={'class': find})
print(div)
I only want the div whose class is exactly 'aa', not 'aa bb' or any other combination. Please help! Thanks!
Answer 0 (score: 4)
This is answered in "BeautifulSoup webscraping find_all( ): finding exact match".
The following returns only the tags whose class attribute is exactly 'aa'.
div = soup.find_all(lambda tag: tag.name == 'div' and tag.get('class') == ['aa'])
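The one-liner above can be tried out with a minimal sketch like the one below. It reuses the HTML from the question. The helper name `has_only_class_aa` is made up for illustration, and `html.parser` is used instead of `lxml` only to avoid the extra dependency. As far as I understand, the original `attrs={'class': find}` attempt fails because Beautiful Soup tests the function against each individual class value, so 'aa' alone already passes the `!= "bb"` check. Comparing the whole class list sidesteps that.

```python
from bs4 import BeautifulSoup

html = """
<div class="aa bb"></div>
<div class="aa ccc"></div>
<div class="aa"></div>
"""

# Assumption: html.parser stands in for the 'lxml' parser used in the question.
soup = BeautifulSoup(html, "html.parser")


def has_only_class_aa(tag):
    # tag.get('class') returns the class attribute as a list of tokens,
    # so comparing against ['aa'] rejects 'aa bb' and 'aa ccc'.
    return tag.name == "div" and tag.get("class") == ["aa"]


exact = soup.find_all(has_only_class_aa)
print(exact)  # [<div class="aa"></div>]
```

Note that the comparison is against the full list, so a div with no class attribute at all (where `tag.get('class')` is `None`) is also excluded.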
Answer 1 (score: 2)
You can also use a simple CSS selector, which matches the class attribute value exactly:
soup.select("div[class=aa]")
Demo:
>>> from bs4 import BeautifulSoup
>>>
>>> html = """
... <div class="aa bb"></div>
... <div class="aa ccc"></div>
... <div class="aa"></div>
... """
>>> soup = BeautifulSoup(html, 'lxml')
>>>
>>> for elm in soup.select("div[class=aa]"):
...     print(str(elm))
...
<div class="aa"></div>