Storing a Unicode string to a file using Python

Time: 2018-11-26 09:00:02

Tags: python

temp.xml

<?xml version="1.0" encoding="utf-8"?>
<PubmedArticleSet>
    <LastName>Nalivaĭko</LastName>
    <ForeName>Anthony V</ForeName>
</PubmedArticleSet>

My code

import xml.dom.minidom


doc = xml.dom.minidom.parse("temp.xml")
file = open('output1.xml','w')

articles = doc.getElementsByTagName('PubmedArticleSet')
for art in articles:
    ln = art.getElementsByTagName("LastName")[0]
    data = ln.firstChild.nodeValue
    file.write("<LastName>")
    file.write(data)
    file.write("</LastName>\n")
print("Completed")
file.close()

The output I need is the same string that appears inside the LastName tag.

Required output: Nalivaĭko

I get this error when I run the code:

Traceback (most recent call last):
  File "C:\Users\Yugam\Desktop\python\ParsingUsingDOM.py", line 12, in <module>
    file.write(data)
  File "C:\Users\Yugam\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u012d' in position 6: character maps to <undefined>

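1 answer:

Answer 0 (score: 0)

The traceback shows that Python is encoding the output with cp1252, the default on your Windows setup, and cp1252 has no mapping for 'ĭ' (U+012D). You can open the file for writing with the encoding you want, like this: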

file = open('output1.xml', 'w', encoding='utf-8')

After that, you can write the Unicode string out as usual.

Output file:
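For reference, here is a minimal sketch of the question's script with that change applied. To keep it runnable on its own it makes a few assumptions that are not in the original: the XML is parsed from an inline string rather than from temp.xml, the output goes to a temporary directory, and the helper name write_last_names is made up for illustration.

```python
import os
import tempfile
import xml.dom.minidom

# Inline copy of the question's temp.xml, so the sketch runs without extra files.
XML_SOURCE = """<?xml version="1.0" encoding="utf-8"?>
<PubmedArticleSet>
    <LastName>Nalivaĭko</LastName>
    <ForeName>Anthony V</ForeName>
</PubmedArticleSet>
"""


def write_last_names(xml_text, out_path):
    """Write each article's LastName element to out_path, encoded as UTF-8."""
    doc = xml.dom.minidom.parseString(xml_text.encode("utf-8"))
    # encoding='utf-8' is the fix: without it, open() falls back to the
    # platform default (cp1252 in the question's traceback).
    with open(out_path, "w", encoding="utf-8") as out:
        for art in doc.getElementsByTagName("PubmedArticleSet"):
            ln = art.getElementsByTagName("LastName")[0]
            out.write("<LastName>" + ln.firstChild.nodeValue + "</LastName>\n")


# Hypothetical output location; the question writes to output1.xml in the
# current directory instead.
out_path = os.path.join(tempfile.mkdtemp(), "output1.xml")
write_last_names(XML_SOURCE, out_path)

# Read the file back with the same encoding to check what was written.
with open(out_path, encoding="utf-8") as f:
    result = f.read()
print(result, end="")
```

The with block also closes the file automatically, so the explicit file.close() from the question is no longer needed.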

<LastName>Nalivaĭko</LastName>
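
If you want to confirm why the original call failed, the small check below may help. It is only a sketch: it tries to encode the offending character with both codecs and prints the encoding that open() would most likely fall back to on your machine when none is given (reported here through locale.getpreferredencoding, which can differ between Python versions and settings).

```python
import locale

ch = "\u012d"  # the 'ĭ' in "Nalivaĭko", as reported in the traceback

# cp1252 is a single-byte code page with no slot for U+012D.
try:
    ch.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent every Unicode code point; this one takes two bytes.
utf8_bytes = ch.encode("utf-8")

print("cp1252 can encode it:", cp1252_ok)
print("utf-8 bytes:", utf8_bytes)
print("likely default for open():", locale.getpreferredencoding(False))
```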