I have a string containing "code" tags, and I want to remove those tags along with everything inside them. For example,
"hello how are <code>this is my code</code> you"
becomes
"hello how are you"
I'm pretty sure BeautifulSoup is the right tool for this job, but I've gone through the documentation and can't figure out how to do it.
Thanks
Answer 0 (score: 4)
Easy with Tag.extract():
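>>> from bs4 import BeautifulSoup as BS
>>> s = "hello how are <code>this is my code</code> you"
>>> soup = BS(s, "html.parser")  # name the parser explicitly
>>> codetags = soup.find_all('code')
>>> for codetag in codetags:
...     codetag.extract()  # removes the tag and its contents from the tree
...
<code>this is my code</code>
>>> print(soup)
hello how are  you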
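
Removing the tag leaves the space on each side of it in place, so the result can contain a doubled space. Below is a minimal sketch of one way to get exactly the string asked for in the question. It assumes Python 3 and the built-in "html.parser" backend; strip_code_tags is a hypothetical helper name, not part of BeautifulSoup. It uses Tag.decompose() instead of Tag.extract(), since the removed tag is not needed afterwards.

```python
from bs4 import BeautifulSoup


def strip_code_tags(html: str) -> str:
    """Return the text of `html` with every <code> element removed.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    # Name the parser explicitly so the result does not depend on
    # which optional parsers (lxml, html5lib) happen to be installed.
    soup = BeautifulSoup(html, "html.parser")

    # decompose() removes the tag and its contents from the tree and
    # discards them; extract() would instead hand the removed tag back.
    for tag in soup.find_all("code"):
        tag.decompose()

    # The spaces on both sides of the removed tag are still there, so
    # collapse runs of whitespace into single spaces.
    return " ".join(soup.get_text().split())


print(strip_code_tags("hello how are <code>this is my code</code> you"))
# prints: hello how are you
```

Note that the whitespace step collapses every run of whitespace in the text, including newlines, so it is only appropriate when the original spacing does not need to be preserved.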