I have some HTML in which I need to replace the class of these two divs with an id and, using beautifulsoup4, wrap them in a wrapper div that has its own id.
Input:
<div class="section1">section one content</div>
<div class="section2">section two content</div>
Output:
<div id="section-wrapper">
<div id="section1">section one content</div><div id="section2">section two content</div>
</div>
Answer 0 (score: 0)
There are several ways to do this. One approach is to create the wrapper with .new_tag() and reset each section's .attrs value:
from bs4 import BeautifulSoup
data = """
<div class="section1">section one content</div>
<div class="section2">section two content</div>
"""
soup = BeautifulSoup(data, "html.parser")
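# Pass attributes as keyword arguments; a positional dict would be read as the namespace
wrapper = soup.new_tag("div", id="section-wrapper")
for section in soup.select("[class^=section]"):
    # section["class"] is a list such as ["section1"], so take its first entry
    section.attrs = {"id": section["class"][0]}
    wrapper.append(section)
print(wrapper.prettify())
Prints:
<div id="section-wrapper">
 <div id="section1">
  section one content
 </div>
 <div id="section2">
  section two content
 </div>
</div>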
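Another of the "several ways" is to build the wrapper in place with .wrap(), so that it stays inside the parsed document instead of being a detached tag. This is only a sketch: it assumes the input has no whitespace between the two divs (so the result matches the requested output exactly) and that every matched div carries a single class.

```python
from bs4 import BeautifulSoup

# Assumption: no newline between the divs, so no stray text nodes are left behind
data = (
    '<div class="section1">section one content</div>'
    '<div class="section2">section two content</div>'
)

soup = BeautifulSoup(data, "html.parser")
sections = soup.select("div[class^=section]")

# Replace each class with an id of the same name
for section in sections:
    section["id"] = section["class"][0]
    del section["class"]

# wrap() puts the first section inside the new div and returns that wrapper;
# append() then moves the remaining sections into it
wrapper = sections[0].wrap(soup.new_tag("div", id="section-wrapper"))
for section in sections[1:]:
    wrapper.append(section)

print(soup)
```

Because the wrapper replaces the first section in the tree, printing soup (rather than wrapper) shows the whole modified document, which is handy when the sections are part of a larger page.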