Select all elements that have a class and lack an attribute using BeautifulSoup4

Time: 2015-11-17 04:03:10

Tags: python beautifulsoup

Using BeautifulSoup4, I can select all the elements I need with the following:

elements = soup.find_all('a', {'class': 'some-class'})

How can I restrict elements to only those anchor links that have the class some-class but no attribute such as href="#"?

1 answer:

Answer 0: (score: 1)

Specify None for href, so that only tags without an href attribute match:

>>> from bs4 import BeautifulSoup
>>> 
>>> soup = BeautifulSoup('''
... <div>
...     <a class="some-class" href="#">11</a>
...     <a class="some-class">22</a>
...     <a class="some-class">33</a>
...     <a class="some-class" href="#">44</a>
... </div>
... ''', 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
>>> soup.find_all('a', {'class': 'some-class'})
[<a class="some-class" href="#">11</a>, <a class="some-class">22</a>,
 <a class="some-class">33</a>, <a class="some-class" href="#">44</a>]
>>> soup.find_all('a', {'class': 'some-class', 'href': None})  # <--
[<a class="some-class">22</a>, <a class="some-class">33</a>]
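
As a possible alternative, the same filter can be written as a CSS selector or as a function filter, both of which state the "no href attribute" condition explicitly. This is a minimal sketch, assuming a bs4 release whose select() supports :not() (4.7 or later, via the soupsieve package). The names html, by_selector, by_function and has_class_but_no_href are illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup

html = '''
<div>
    <a class="some-class" href="#">11</a>
    <a class="some-class">22</a>
    <a class="some-class">33</a>
    <a class="some-class" href="#">44</a>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')

# CSS selector: <a> tags with class some-class and no href attribute at all.
by_selector = soup.select('a.some-class:not([href])')


def has_class_but_no_href(tag):
    """Hypothetical helper: True for <a class="some-class"> without href."""
    return (tag.name == 'a'
            and 'some-class' in tag.get('class', [])
            and not tag.has_attr('href'))


# find_all() accepts a function that receives each tag and returns a bool.
by_function = soup.find_all(has_class_but_no_href)

print([a.get_text() for a in by_selector])   # ['22', '33']
print([a.get_text() for a in by_function])   # ['22', '33']
```

If the goal is to exclude only links whose href is exactly "#" while keeping links with other href values, the selector 'a.some-class:not([href="#"])' should do that instead.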