Python function to replace one HTML tag with another

Time: 2018-03-09 22:58:40

Tags: python html beautifulsoup

How would I write a function (using BeautifulSoup or otherwise) that replaces all instances of one HTML tag with another? For example:

text = "<p>this is some text<p><bad class='foo' data-foo='bar'> with some tags</bad><span>that I would</span><bad>like to replace</bad>"
new_text = replace_tags(text, "bad", "p")
print(new_text)  # "<p>this is some text<p><p class='foo' data-foo='bar'> with some tags</p><span>that I would</span><p>like to replace</p>"

I tried this, but preserving each tag's attributes is a challenge:

def replace_tags(string, old_tag, new_tag):
  soup = BeautifulSoup(string, "html.parser")
  nodes = soup.findAll(old_tag)
  for node in nodes:
      new_content = BeautifulSoup("<{0}>{1}</{0}".format(
          new_tag, node.contents[0],
      ))  
      node.replaceWith(new_content)                                                
  string = soup.body.contents[0]
  return string

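Any idea how to replace just the tag element itself in the soup? Or, better yet, does anyone know of a library or utility function that handles this more robustly than what I've written?

Thanks!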

1 answer:

Answer 0: (score: 2)

It's actually quite simple. You can set the tag's name directly:

old_tag.name = new_tag

Applied to the function from the question:

from bs4 import BeautifulSoup

def replace_tags(string, old_tag, new_tag):
    soup = BeautifulSoup(string, "html.parser")
    # Renaming a node in place keeps its attributes and children.
    for node in soup.find_all(old_tag):
        node.name = new_tag
    return soup  # or return str(soup) if you want a string.

text = "<p>this is some text<p><bad class='foo' data-foo='bar'> with some tags</bad><span>that I would</span><bad>like to replace</bad>"
new_text = replace_tags(text, "bad", "p")
print(new_text)

Output:

<p>this is some text<p><p class="foo" data-foo="bar"> with some tags</p><span>that I would</span><p>like to replace</p></p></p>

From the documentation:

Every tag has a name, accessible as .name:

tag.name
# u'b'

If you change a tag's name, the change will be reflected in any HTML markup generated by Beautiful Soup:

tag.name = "blockquote"
tag
# <blockquote class="boldest">Extremely bold</blockquote>
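
Beautiful Soup re-serializes the whole tree, so the result can differ from the input in small ways: the two unclosed <p> tags gain closing tags and the attribute quotes are normalized. If the rest of the markup has to stay exactly as written, one possible alternative is a token-level rewrite with the standard library's html.parser. This is a different technique from the one in the answer, and only a minimal sketch. The names _TagRenamer and replace_tags_raw are made up for illustration, and the sketch assumes reasonably ordinary HTML (end tags are re-emitted in lowercase, and entity references gain a trailing semicolon if they lacked one).

```python
from html.parser import HTMLParser


class _TagRenamer(HTMLParser):
    """Hypothetical helper: re-emits the input, renaming one tag."""

    def __init__(self, old_tag, new_tag):
        # convert_charrefs=False keeps entities such as &amp; as separate
        # events so they can be written back instead of being decoded.
        super().__init__(convert_charrefs=False)
        self.old_tag = old_tag.lower()
        self.new_tag = new_tag
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written,
        # so attributes and their quoting are preserved verbatim.
        raw = self.get_starttag_text()
        if tag == self.old_tag:
            raw = "<" + self.new_tag + raw[1 + len(tag):]
        self.parts.append(raw)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: the raw text already ends with "/>".
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        name = self.new_tag if tag == self.old_tag else tag
        self.parts.append("</" + name + ">")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&" + name + ";")

    def handle_charref(self, name):
        self.parts.append("&#" + name + ";")

    def handle_comment(self, data):
        self.parts.append("<!--" + data + "-->")

    def handle_decl(self, decl):
        self.parts.append("<!" + decl + ">")

    def handle_pi(self, data):
        self.parts.append("<?" + data + ">")


def replace_tags_raw(string, old_tag, new_tag):
    """Rename old_tag to new_tag without rebuilding the document tree."""
    parser = _TagRenamer(old_tag, new_tag)
    parser.feed(string)
    parser.close()
    return "".join(parser.parts)


text = "<p>this is some text<p><bad class='foo' data-foo='bar'> with some tags</bad><span>that I would</span><bad>like to replace</bad>"
print(replace_tags_raw(text, "bad", "p"))
```

Because nothing is re-parsed into a tree, unclosed tags stay unclosed and attribute text is copied through untouched. The trade-off is that this does no repair or validation of the markup, which Beautiful Soup does.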