UnicodeEncodeError 'charmap' codec can't encode characters in position 1-12
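I get this error when I paste a Burmese string into a Jinja2 template and save the rendered template. I have installed all the required fonts in the OS and tried using the codecs library.

The process: a Python script parses a CSV file with data and builds a dictionary, and that dictionary is then used to fill in the variables of the Jinja2 template. The error is raised at the moment of writing to the file. I am using Python 3.4. There is a package called python-myanmar, but it targets 2.7 and I do not want to downgrade my own code.

I have already read all of these: http://www.unicode.org/notes/tn11/, http://chimera.labs.oreilly.com/books/1230000000393/ch02.html#_discussion_31, and the https://code.google.com/p/python-myanmar/ package, and I have installed the system fonts. I can encode the string with .encode('utf-8'), but I cannot call .decode() without an error.

The question is: how can I write the data to the file without downgrading the code? Installing something extra is acceptable, but ideally I would use only what is built into Python 3.4.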
C:\Users\...\autocrm.py in create_templates(csvfile_location, csv_delimiter, template_location, countries_to_update, push_onthefly, csv_gspreadsheet, **kwargs)
    270     ### use different parsers for ventures due to possible difference in website design
    271     ### checks if there is a link in CSV/TSV
--> 272     if variables['promo_link'] != '':
    273         article_values = soup_the_newsletter_article(variables['promo_link'])
    274     if variables['item1_link'] != '':

C:\Users\...\autocrm.py in push_to_ums(countries_to_update, html_template, **kwargs)
    471     ### save to import.xml
    472     with open(xml_path_upload, 'w') as writefile:
--> 473         writefile.write(template.render(**values))
    474     print('saved the import.xml')
    475

C:\Python34\lib\encodings\cp1252.py in encode(self, input, final)
     17 class IncrementalEncoder(codecs.IncrementalEncoder):
     18     def encode(self, input, final=False):
---> 19         return codecs.charmap_encode(input,self.errors,encoding_table)[0]
     20
     21 class IncrementalDecoder(codecs.IncrementalDecoder):

UnicodeEncodeError: 'charmap' codec can't encode characters in position 6761-6772: character maps to <undefined>
BTW, if sys.getdefaultencoding() returns UTF-8, why does the traceback point to cp1252.py?
with open(template_location, 'r') as raw_html:
    template = Template(raw_html.read())
    print('writing to template: ' + variables['country_id'])
    # import ipdb;ipdb.set_trace()
    with open('rendered_templates_L\\NL_' +
              variables['country_id'] + ".html", 'w', encoding='utf-8') as writefile:
        rendered_template = template.render(**alldata)
        writefile.write(rendered_template)
Answer 0 (score: 0)
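You opened the output file without specifying an encoding, so the default system encoding is used; here that is CP1252, as the cp1252.py frame in the traceback shows.
Rendering the Jinja template produces a Unicode string, which has to be encoded when it is written to the file, but the default system encoding cannot represent the code points in the result.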
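To make the mismatch concrete, here is a minimal sketch of what happens under the hood. The sample Burmese string and the can_encode helper are illustrative assumptions, not taken from the question's code. It also touches on the sys.getdefaultencoding() point: as far as I know, text-mode open() without encoding= does not consult that value at all in Python 3.4, but falls back to locale.getpreferredencoding(False), which on a Western European Windows install is typically cp1252.

```python
import locale
import sys

# Illustrative Burmese sample text (an assumption, not data from the question).
text = "မြန်မာ"

# Reported as 'utf-8' in Python 3; it is not what open() uses for files.
print(sys.getdefaultencoding())

# Locale-dependent value that text-mode open() falls back to when
# no encoding= argument is given.
print(locale.getpreferredencoding(False))


def can_encode(s, codec):
    """Hypothetical helper: report whether s is representable in codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 is a single-byte code page with no Burmese characters,
# whereas UTF-8 can represent every Unicode code point.
print(can_encode(text, "cp1252"))
print(can_encode(text, "utf-8"))
```

This would also explain why .decode() appeared to fail: in Python 3 a str has only .encode(), and .decode() exists on bytes objects.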
The solution is to pick an explicit codec. If you are producing XML, UTF-8 is the default encoding and can handle all of Unicode:
with open(xml_path_upload, 'w', encoding='utf8') as writefile:
    writefile.write(template.render(**values))
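If you want to check the fix in isolation, something along these lines should work. It is only a sketch: the rendered string is a plain-str stand-in for template.render(**values) so that it runs without Jinja2, and the temporary output path is hypothetical.

```python
import os
import tempfile

# Stand-in for template.render(**values): any str with non-Latin text
# behaves the same way when written to a file.
rendered = "<title>{}</title>".format("မြန်မာ")

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "import.xml")

# An explicit encoding makes the write independent of the system locale.
with open(out_path, "w", encoding="utf-8") as writefile:
    writefile.write(rendered)

# Read back with the same encoding to confirm nothing was lost.
with open(out_path, "r", encoding="utf-8") as readfile:
    round_tripped = readfile.read()

print(round_tripped == rendered)
```

Whatever reads the file later has to use the same encoding, so it is worth passing encoding='utf-8' on every open() call that touches these files, including the one that reads the template.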