Searching for div and a tags with specified classes using BeautifulSoup

Time: 2014-11-15 03:31:15

Tags: python beautifulsoup html-parsing

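Hi all, I am trying to use BeautifulSoup's find_all to get both <div class="div1"> content </div> and <a class="a1"> link </a> in a single call.
I am just learning BeautifulSoup. I know how to do this for one tag with soup.find_all("div", {"class": "div1"}), but how do I get two different tags, each with its own specific class?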

Can I do something like this?

for link in soup.find_all("div",{ "class" : "div1" } and "a",{"class" : "a1"}):

Sample HTML

 <div class="div1"> content </div>
 <div class="div2"> content </div>
 <div class="div3"> content </div>

 <a class="a1"> link </a>
 <a class="a2"> link </a>
 <a class="a2"> link </a>

I have searched a lot but did not find anything similar. Thanks.

1 answer:

Answer 0: (score: 1)

You can provide a list of classes to search for:

soup.find_all(class_=["div1", "a1"])

You can also pass a list of tag names to look for:

soup.find_all(["a", "div"], class_=["div1", "a1"])

Demo:

>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <div>
...     <div class="div1"> content1 </div>
...     <div class="div2"> content2 </div>
...     <div class="div3"> content3 </div>
... 
...     <a class="a1"> link1 </a>
...     <a class="a2"> link2 </a>
...     <a class="a2"> link3 </a>
... </div>
... """
>>> 
>>> soup = BeautifulSoup(data, "html.parser")
>>> soup.find_all(class_=["div1", "a1"])
[<div class="div1"> content1 </div>, <a class="a1"> link1 </a>]
>>>
>>> soup.find_all(["a", "div"], class_=["div1", "a1"])
[<div class="div1"> content1 </div>, <a class="a1"> link1 </a>]