I am trying to save the pretty-printed output of an HTML file to a txt file, but I get this error message:
Traceback (most recent call last):
  File "prettyhtmlfiles.py", line 16, in <module>
    file.write(soup.prettify())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbb' in position 8532: ordinal not in range(128)
How can I fix this?
My code:
import urllib2
import os
from bs4 import BeautifulSoup
import csv
url = "/home/sveisa/S141test/ayuki.html"
with open(url, 'r') as f:
    data = f.read()
soup = BeautifulSoup(open('/home/sveisa/S141test/ayuki.html').read())
print(soup.prettify())
file = open("newfile.txt", "w")
file.write(soup.prettify())
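Answer 0 (score: 2)
Try this; it should work. In Python 2, soup.prettify() returns a unicode string, and writing it to a file opened with open() triggers an implicit ASCII encode, which fails on non-ASCII characters such as u'\xbb'. Encoding to UTF-8 explicitly before writing avoids that: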
print >> file, (soup.prettify().encode('utf-8'))
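An alternative that avoids the manual encode is to open the output file with an explicit encoding. The following is only a minimal sketch, not the asker's script: the `pretty` string is a stand-in for whatever `soup.prettify()` returns, and `out_path` is a hypothetical temporary location chosen for illustration. It uses `io.open`, which exists in both Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Stand-in for the unicode string returned by soup.prettify() (assumption);
# u'\xbb' is the character named in the traceback.
pretty = u"<p>\n next \xbb\n</p>\n"

# Hypothetical output path inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "newfile.txt")

# io.open encodes the text as UTF-8 on write, so no .encode() call is needed.
with io.open(out_path, "w", encoding="utf-8") as out:
    out.write(pretty)

# Read the file back with the same encoding to check the round trip.
with io.open(out_path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

Using a `with` block also closes the file, which the original script never does after `file.write(...)`.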