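How to replace a class with an ID using Python BeautifulSoup4

Time: 2018-12-12 12:08:29

Tags: python python-2.7 beautifulsoup

I have some HTML in which I need to replace the class of these two divs with an id of the same name and then, using beautifulsoup4, wrap them in a wrapper div that has another id.

Input: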

<div class="section1">section one content</div>
<div class="section2">section two content</div>

Output:

<div id="section-wrapper">
<div id="section1">section one content</div><div id="section2">section two content</div>
</div>

1 answer:

Answer 0 (score: 0)

There are several ways to do this. One approach is to create the wrapper with .new_tag() and reset the .attrs value of each section:

from bs4 import BeautifulSoup


data = """
    <div class="section1">section one content</div>
    <div class="section2">section two content</div>
"""

soup = BeautifulSoup(data, "html.parser")

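# Pass attributes as keyword arguments: the second positional parameter
# of new_tag() is the namespace, not an attribute dict.
wrapper = soup.new_tag("div", id="section-wrapper")
for section in soup.select("[class^=section]"):
    # section["class"] is a list such as ["section1"]; use its first entry as the id.
    section.attrs = {"id": section["class"][0]}

    # Appending moves the section out of the soup and into the wrapper.
    wrapper.append(section)

print(wrapper.prettify())

Prints:

<div id="section-wrapper">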
 <div id="section1">
  section one content
 </div>
 <div id="section2">
  section two content
 </div>
</div>
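
Note that in the approach above the wrapper is a standalone tag: the sections are moved into it, but the wrapper itself is never inserted into soup. If the surrounding document needs to be kept, a possible variation is to place the wrapper into the tree first and then move the sections into it. The following is only a minimal sketch under the assumption that at least one matching div exists; the variable names (html, sections, class_name) are illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup

html = """
<div class="section1">section one content</div>
<div class="section2">section two content</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() returns a list, so moving tags while looping over it is safe.
sections = soup.select('div[class^="section"]')

# Put the empty wrapper into the tree where the first section sits.
# Assumption: at least one section matched; otherwise sections[0] raises IndexError.
wrapper = soup.new_tag("div", id="section-wrapper")
sections[0].insert_before(wrapper)

for section in sections:
    class_name = section["class"][0]  # "class" is multi-valued, so it is a list
    del section["class"]              # drop the class attribute
    section["id"] = class_name        # reuse the old class name as the id
    wrapper.append(section)           # moves the tag inside the wrapper

print(soup.prettify())
```

With this variant, str(soup) or soup.prettify() yields the whole document with the wrapper in place, instead of only the detached wrapper tag.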