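python - Remove text inside certain tags

Time: 2013-09-14 23:18:01

Tags: python beautifulsoup

I have a string containing "code" tags, and I want to remove those tags along with everything inside them. For example,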

"hello how are <code>this is my code</code> you"

becomes

"hello how are you"

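I'm fairly sure BeautifulSoup is the right tool for this job, but I have looked through the documentation and can't figure out how to do it.

Thanks

1 answer:

Answer 0 (score: 4)

Easy with Tag.extract(), which removes a tag and everything inside it from the tree: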

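>>> from bs4 import BeautifulSoup as BS
>>> s = "hello how are <code>this is my code</code> you"
>>> soup = BS(s, "html.parser")  # name the parser explicitly
>>> codetags = soup.find_all('code')
>>> for codetag in codetags:
...     codetag.extract()  # detaches the tag and its contents; returns the tag
...
<code>this is my code</code>
>>> print(soup)
hello how are  you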
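
Note that the result above still has two spaces where the tag used to be, while the question asks for "hello how are you". A minimal sketch of one way to tidy that up, assuming it is acceptable to collapse every run of whitespace in the remaining markup into a single space. The helper name strip_tags_with_content and its tag_name parameter are hypothetical and not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup


def strip_tags_with_content(html, tag_name="code"):
    """Return html with every <tag_name> element and its contents removed.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(tag_name):
        # extract() detaches the tag (and its children) from the tree
        tag.extract()
    # Assumption: collapsing all whitespace runs into one space is acceptable
    return " ".join(str(soup).split())


print(strip_tags_with_content("hello how are <code>this is my code</code> you"))
# prints: hello how are you
```

Any other tags in the input are left in place, since only the elements matching tag_name are extracted. If whitespace inside the remaining markup is significant, the final join/split step would need to be replaced with something more targeted.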