我将如何编写一个函数(使用BeautifulSoup或其他函数)来替换一个HTML标记的所有实例。例如:
text = "<p>this is some text<p><bad class='foo' data-foo='bar'> with some tags</bad><span>that I would</span><bad>like to replace</bad>"
new_text = replace_tags(text, "bad", "p")
print(new_text) # "<p>this is some text<p><p class='foo' data-foo='bar'> with some tags</p><span>that I would</span><p>like to replace</p>"
我试过这个,但保留每个标签的属性是一个挑战:
def replace_tags(string, old_tag, new_tag):
soup = BeautifulSoup(string, "html.parser")
nodes = soup.findAll(old_tag)
for node in nodes:
new_content = BeautifulSoup("<{0}>{1}</{0}".format(
new_tag, node.contents[0],
))
node.replaceWith(new_content)
string = soup.body.contents[0]
return string
知道如何在汤中替换标签元素本身吗?或者,更好的是,有没有人知道一个库/实用程序函数能比我写的东西更强大地处理它?</ p>
谢谢!
答案 0 :(得分:2)
实际上它非常简单。您可以直接使用typedef std::integral_constant<int, 2> two_t
。
old_tag.name = new_tag
输出:
def replace_tags(string, old_tag, new_tag):
soup = BeautifulSoup(string, "html.parser")
for node in soup.findAll(old_tag):
node.name = new_tag
return soup # or return str(soup) if you want a string.
text = "<p>this is some text<p><bad class='foo' data-foo='bar'> with some tags</bad><span>that I would</span><bad>like to replace</bad>"
new_text = replace_tags(text, "bad", "p")
print(new_text)
每个代码都有一个名称,可以
<p>this is some text<p><p class="foo" data-foo="bar"> with some tags</p><span>that I would</span><p>like to replace</p></p></p>
访问:.name
如果您更改了标签的名称,则更改将反映在Beautiful Soup生成的任何HTML标记中:
tag.name # u'b'