Error when writing BeautifulSoup back to the original HTML file

Time: 2015-03-25 18:14:27

Tags: python html windows beautifulsoup html-parsing

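Is there a problem with the encoding of the BeautifulSoup copy of my original HTML file?

I am told that I cannot write to the file because I must write a str, not None.

Please see the code and the TypeError below.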

#Manipulating HTML and saving changes with BeautifulSoup

#Importing libraries
from bs4 import BeautifulSoup

#Opening the local HTML file
site_html = open(r"C:\Users\rbaden\desktop\KPI_Site\index.html")

#Creating Soup  from source HTML file
soup =BeautifulSoup(site_html)
#print(soup.prettify())

#Locate and view specified class in HTML file
test = soup.find_all(class_='test-message-one')
print(test)

#Test place holder for a python variable that should replace the specified class
var = ('Testing...456')

#Replace the class in soup rendition of HTML
for i in soup.find_all(class_='test-message-one'):
    i.string = var

#overwriting the source HTML file on local drive
with open(r"C:\Users\rbaden\desktop\KPI_Site\index.html") as f:
    f.write(soup.content)


1 answer:

Answer 0: (score: 4)

First, you need to open the file in w mode.
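To illustrate why the mode matters, here is a minimal sketch using only the standard library. open() defaults to read mode ("r"), so a write on that handle fails; "w" truncates the file and allows writing. The temporary path is a hypothetical stand-in for index.html, and encoding="utf-8" is an assumption about the file.

```python
import io
import os
import tempfile

# Hypothetical scratch file standing in for index.html.
path = os.path.join(tempfile.mkdtemp(), "index.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<p>old</p>")

# open() without a mode means "r" (read-only), so write() is rejected.
read_mode_error = None
try:
    with open(path) as f:
        f.write("<p>new</p>")
except io.UnsupportedOperation as exc:
    read_mode_error = exc

# Mode "w" truncates the existing file and permits writing.
with open(path, "w", encoding="utf-8") as f:
    f.write("<p>new</p>")

with open(path, encoding="utf-8") as f:
    result = f.read()

print(type(read_mode_error).__name__, result)
```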

Also, you need to write either str(soup) or soup.prettify():

with open(r"C:\Users\rbaden\desktop\KPI_Site\index.html", "w") as f:
    f.write(soup.prettify())
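For reference, the whole round trip might look like the sketch below. It also shows where the None in the TypeError likely comes from: BeautifulSoup treats an unknown attribute such as soup.content as a tag lookup, equivalent to soup.find("content"), which returns None when no such tag exists. The temporary file, the sample markup, the explicit "html.parser" argument and encoding="utf-8" are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical stand-in for index.html with one matching element.
path = os.path.join(tempfile.mkdtemp(), "index.html")
with open(path, "w", encoding="utf-8") as f:
    f.write('<html><body><p class="test-message-one">old</p></body></html>')

# Parse the file; naming the parser avoids relying on whichever is installed.
with open(path, encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# soup.content is a lookup for a <content> tag, which does not exist here.
missing = soup.content

# Replace the text of every element carrying the class.
for tag in soup.find_all(class_="test-message-one"):
    tag.string = "Testing...456"

# Write the modified document back as a string, in "w" mode.
with open(path, "w", encoding="utf-8") as f:
    f.write(str(soup))

with open(path, encoding="utf-8") as f:
    updated = f.read()

print(missing)
print(updated)
```

str(soup) keeps the markup close to the original layout, while soup.prettify() re-indents it, so the choice depends on whether the file's formatting should be preserved.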