Remove a div with a specific class using BeautifulSoup

Time: 2015-08-18 05:10:59

Tags: python python-2.7 beautifulsoup

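I want to remove a specific div from a soup object.
I am using Python 2.7 and bs4.

According to the documentation, we can use div.decompose().

But this removes all the divs. How can I remove only the divs with a specific class?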

4 answers:

Answer 0: (score: 32)

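Sure, you can just select, find, or find_all the divs of interest in the usual way, and then call decompose() on those divs.

For example, to remove all divs with the class sidebar, you can do this: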
# replace with `soup.findAll` if you are using BeautifulSoup3
for div in soup.find_all("div", {'class':'sidebar'}): 
    div.decompose()
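
As a minimal sketch of the loop above, assuming bs4 with the built-in html.parser: the markup here is made up for illustration, and it shows that matching on a class also catches divs that carry additional classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, for illustration only.
html = (
    '<div class="sidebar">A</div>'
    '<div class="sidebar wide">B</div>'
    '<div class="content">C</div>'
)
soup = BeautifulSoup(html, "html.parser")

# class_ matches any div whose class list contains "sidebar",
# so both the first and the second div are removed.
for div in soup.find_all("div", class_="sidebar"):
    div.decompose()

remaining = str(soup)
print(remaining)  # <div class="content">C</div>
```

find_all returns a list-like result, so removing tags from the tree while looping over it is safe.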

If you want to remove a div with a specific id, say main-content, you can do it with:
soup.find('div', id="main-content").decompose()
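
One caveat with the one-liner above: find returns None when nothing matches, so calling decompose() directly would raise an AttributeError. A hedged sketch of a guarded version, using made-up markup that deliberately lacks the main-content div:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with no div whose id is "main-content".
html = '<div id="header">H</div><p>text</p>'
soup = BeautifulSoup(html, "html.parser")

# find() returns None when there is no match, so check before removing.
target = soup.find("div", id="main-content")
if target is not None:
    target.decompose()

result = str(soup)
print(result)  # the document is unchanged
```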

Answer 1: (score: 7)

This should help you:

from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup,"html.parser")
a_tag = soup

soup.find('div',class_='2').decompose()

print(a_tag)

Output:

<a>This is not div <div class="1">This is div 1</div></a>

Let me know if that helps.

Answer 2: (score: 3)

Hope this helps:

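from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup,"html.parser")

# A class name starting with a digit is not a valid CSS identifier, so newer
# bs4 releases reject 'div.1'; the attribute selector matches the same divs.
for tag in soup.select('div[class~="1"]'):
  tag.decompose()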

print(soup)

Answer 3: (score: -1)

    >>> from BeautifulSoup import BeautifulSoup
    >>> soup = BeautifulSoup('<body><div>1</div><div class="comment"><strong>2</strong></div></body>')
    >>> for div in soup.findAll('div', 'comment'):
    ...   div.extract()
    ... 
    <div class="comment"><strong>2</strong></div>
    >>> soup
    <body><div>1</div></body>
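
The session above uses BeautifulSoup 3. A rough bs4 equivalent, assuming the html.parser backend, might look like the sketch below. Unlike decompose(), extract() returns the removed tag, which is useful when the removed content is still needed.

```python
from bs4 import BeautifulSoup

html = '<body><div>1</div><div class="comment"><strong>2</strong></div></body>'
soup = BeautifulSoup(html, "html.parser")

# extract() detaches each matching div from the tree and returns it,
# whereas decompose() destroys the tag and returns nothing.
removed = [div.extract() for div in soup.find_all("div", class_="comment")]

removed_html = [str(tag) for tag in removed]
remaining = str(soup)
print(removed_html)  # ['<div class="comment"><strong>2</strong></div>']
print(remaining)     # <body><div>1</div></body>
```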