Encoding error when trying to write a file with Python

Date: 2018-05-09 16:53:32

Tags: python-3.x encoding

Here is the full script:

import requests
import bs4


res = requests.get('https://example.com')
soup = bs4.BeautifulSoup(res.text, 'lxml')
page_HTML_code = soup.prettify()

multiline_code = """{}""".format(page_HTML_code)

f = open("testfile.txt","w+")
f.write(multiline_code)
f.close()

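I'm trying to write the entire downloaded HTML to a file while keeping it neat and clean.

I know the problem is with the text, since some characters can't be saved, but I'm not sure how to encode the text correctly.

Can anyone help?

This is the error message I get: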

"C:\Location", line 16, in <module>
    f.write(multiline_code)
  File "C:\\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0421' in position 209: character maps to <undefined>

1 answer:

Answer 0 (score: 1)

I did some digging, and this works:

import requests
import bs4


res = requests.get('https://example.com')
soup = bs4.BeautifulSoup(res.text, 'lxml')
page_HTML_code = soup.prettify()

multiline_code = """{}""".format(page_HTML_code)

# Passing encoding='utf-8' when opening the file is what fixed the error
with open('testfile.html', 'w+', encoding='utf-8') as fb:
    fb.write(multiline_code)
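
For anyone wondering why this helps: when open() is called without an encoding argument, Python uses the platform's preferred encoding, which on many Windows setups is cp1252. That code page has no mapping for '\u0421' (a Cyrillic capital letter), hence the 'charmap' error in the traceback. Below is a minimal sketch that reproduces the failure and the fix without any network access. The temporary directory and the variable names are my own choices for illustration, not part of the original script.

```python
import os
import tempfile

# The character named in the traceback: CYRILLIC CAPITAL LETTER ES
text = "\u0421"

# cp1252 has no mapping for this character, so encoding it fails
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent every Unicode character
utf8_bytes = text.encode("utf-8")

# Write with an explicit encoding instead of relying on the platform default
# (the temp directory here is only for the demonstration)
path = os.path.join(tempfile.mkdtemp(), "testfile.html")
with open(path, "w", encoding="utf-8") as fh:
    fh.write(text)

# Read it back with the same encoding to confirm nothing was lost
with open(path, encoding="utf-8") as fh:
    round_trip = fh.read()

print(cp1252_ok, utf8_bytes, round_trip == text)
```

Whatever reads testfile.html later also needs to open it as UTF-8, otherwise the non-ASCII characters will show up garbled.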