How do I copy and replace an element and its children in Python using Beautiful Soup?

Asked: 2018-05-28 13:55:40

Tags: python html beautifulsoup

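I have two HTML files, each containing a div with the ID htmlbody. I want to check whether the htmlbody element in one file is identical to the htmlbody element in the other. If it is not, I want to copy the htmlbody element from the first file and use it to replace the one in the second file. Please see the code below.

I have tried working from the "Modifying the tree" documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#append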

import codecs
from bs4 import BeautifulSoup


def getMainFile():
    # Opens and parses the main HTML file.
    with codecs.open("index.html", 'r') as main_html:
        soup = BeautifulSoup(main_html, 'html.parser')
    # Returns the element with id "htmlbody" from the main file.
    return soup.find(id="htmlbody")


def getUserFile():
    # Opens and parses the user's HTML file.
    with codecs.open("userone.html", 'r') as user_html:
        soup = BeautifulSoup(user_html, 'html.parser')
    return soup.find(id="htmlbody")


# Checks whether the htmlbody elements of the two files match.
if getMainFile() == getUserFile():
    print("all good")
else:
    new_content = getMainFile()
    with codecs.open("userone.html", 'r') as user_html:
        soup = BeautifulSoup(user_html, 'html.parser')
    # new_content is never inserted into soup here, so the file is
    # written back with its original htmlbody element.
    with open("userone.html", "w") as file:
        file.write(soup.prettify())
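One possible way to do the replacement step is sketched below, using Beautiful Soup's `replace_with()` together with `copy.copy()`, which produces a detached copy of a tag and its children. The helper name `sync_element` and its parameters are hypothetical and not part of Beautiful Soup; it works on HTML strings rather than files so that it can be tried in isolation. Treat it as a minimal sketch under those assumptions, not a definitive solution.

```python
import copy

from bs4 import BeautifulSoup


def sync_element(main_html, user_html, element_id="htmlbody"):
    """Return user_html with the element `element_id` replaced by the
    matching element from main_html.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    main_soup = BeautifulSoup(main_html, "html.parser")
    user_soup = BeautifulSoup(user_html, "html.parser")

    main_el = main_soup.find(id=element_id)
    user_el = user_soup.find(id=element_id)
    if main_el is None or user_el is None:
        raise ValueError("element with id %r not found in both documents" % element_id)

    # Tag equality compares the tag name, attributes and children.
    if main_el == user_el:
        return user_html

    # Copy the tag (with its children) so main_soup is left untouched,
    # then swap it in place of the user's element.
    user_el.replace_with(copy.copy(main_el))
    return str(user_soup)
```

To apply this to the files from the question, one could read `index.html` and `userone.html` into strings, call `sync_element(main_html, user_html)`, and write the returned string back to `userone.html`. Returning `str(user_soup)` rather than `prettify()` is a design choice here: `prettify()` reindents the whole document, which changes whitespace outside the replaced element.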

0 answers:

No answers