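I'm trying to open a file and just realized that Python has a problem with my username (it's in Russian). Any suggestions on how to decode/encode it properly to keep IDLE happy?
I'm using Python 2.6.5.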
xmlfile = open(u"D:\\Users\\Эрик\\Downloads\\temp.xml", "r")
Traceback (most recent call last):
File "<pyshell#23>", line 1, in <module>
xmlfile = open(str(u"D:\\Users\\Эрик\\Downloads\\temp.xml"), "r")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 9-12: ordinal not in range(128)
os.sys.getfilesystemencoding() returns 'mbcs'
xmlfile = open(u"D:\Users\Эрик\Downloads\temp.xml".encode("mbcs"), "r")
Traceback (most recent call last):
File "", line 1, in
xmlfile = open(u"D:\Users\Эрик\Downloads\temp.xml".encode("mbcs"), "r")
IOError: [Errno 22] invalid mode ('r') or filename: 'D:\Users\Y?ee\Downloads\temp.xml'
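Answer 0 (score: 0)
The first problem is that the parser tries to interpret the backslashes in the string unless you use the r"raw quote" prefix. In 2.6.5 you don't need to treat your Unicode strings specially, but you may need a file encoding declaration in your source code, such as: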
# -*- coding: utf-8 -*-
as defined in PEP 263. Here is an example that works interactively:
$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3] on linux2
>>> f = r"D:\Users\Эрик\Downloads\temp.xml"
>>> f
'D:\\Users\\\xd0\xad\xd1\x80\xd0\xb8\xd0\xba\\Downloads\\temp.xml'
>>> x = open(f, 'w')
>>> x.close()
>>>
$ ls D*
D:\Users\Эрик\Downloads\temp.xml
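Yes, this is on a Unix system, so \ has no special meaning there, and my terminal encoding is utf-8, but it works. You may have to give the parser an encoding hint when you parse the file.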
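A minimal sketch of the approach described above: pass a unicode path straight to the file API and state the content encoding explicitly. It uses a temporary directory instead of D:\Users so it can run anywhere, and it assumes the filesystem can store Cyrillic names. The names base, user_dir and xml_path are made up for illustration, and utf-8 for the file contents is an assumption.

```python
# -*- coding: utf-8 -*-
import io
import os
import shutil
import tempfile

# Hypothetical stand-in for D:\Users\Эрик\Downloads
base = tempfile.mkdtemp()
user_dir = os.path.join(base, u"\u042d\u0440\u0438\u043a")  # u"Эрик"
os.mkdir(user_dir)
xml_path = os.path.join(user_dir, u"temp.xml")

try:
    # The path stays a unicode object; no str() or .encode() on it.
    with io.open(xml_path, "w", encoding="utf-8") as f:
        f.write(u"<root/>")
    # The encoding argument describes the file contents, not the file name.
    with io.open(xml_path, "r", encoding="utf-8") as f:
        content = f.read()
finally:
    shutil.rmtree(base)
```

io.open is available from Python 2.6 onward and behaves like the built-in open of Python 3, so the same lines work on both.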
Answer 1 (score: 0)
First problem:
xmlfile = open(u"D:\\Users\\Эрик\\Downloads\\temp.xml", "r")
### The above line should be OK, provided that you have the correct coding line
### For example # coding: cp1251
Traceback (most recent call last):
File "<pyshell#23>", line 1, in <module>
xmlfile = open(str(u"D:\\Users\\Эрик\\Downloads\\temp.xml"), "r")
### HOWEVER the above traceback line shows you actually using str()
### which is DIRECTLY causing the error because it is attempting
### to encode your filename using the default ASCII codec -- DON'T DO THAT.
### Please copy/paste; don't type from memory.
UnicodeEncodeError: 'ascii' codec can't encode characters in position 9-12: ordinal not in range(128)
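The failure can be reproduced without touching the filesystem. The sketch below assumes, as the annotations above say, that str() on a unicode object in Python 2 encodes with the default ASCII codec. The explicit .encode("ascii") stands in for that implicit step, so the snippet behaves the same on Python 3. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# Same path as in the question, with the Cyrillic name written as escapes.
path = u"D:\\Users\\\u042d\u0440\u0438\u043a\\Downloads\\temp.xml"

try:
    # What str(path) does implicitly on Python 2.
    path.encode("ascii")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    # exc.end is exclusive, so (9, 13) corresponds to "position 9-12".
    bad_range = (exc.start, exc.end)
```

Dropping the str() call and handing the unicode object to open() avoids this encoding step entirely.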
Second problem:
os.sys.getfilesystemencoding()
returns 'mbcs'
xmlfile = open(u"D:\Users\Эрик\Downloads\temp.xml".encode("mbcs"), "r")
### (a) \t is interpreted as a TAB character, hence the file name is invalid.
### (b) encoding with mbcs seems not to be useful; it messes up your name ("Y?ee").
Traceback (most recent call last):
File "", line 1, in
xmlfile = open(u"D:\Users\Эрик\Downloads\temp.xml".encode("mbcs"), "r")
IOError: [Errno 22] invalid mode ('r') or filename: 'D:\Users\Y?ee\Downloads\temp.xml'
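Point (b) can be illustrated with named code pages, since the 'mbcs' codec exists only on Windows and maps to whatever ANSI code page the machine uses. The thread does not say which code page that was, so cp1252 (Western European) and cp1251 (Cyrillic) below are assumptions chosen for the demonstration.

```python
# -*- coding: utf-8 -*-
name = u"\u042d\u0440\u0438\u043a"  # u"Эрик"

# A code page without Cyrillic cannot represent the name; with
# errors="replace" every character degrades to "?".
lossy = name.encode("cp1252", "replace")

# A Cyrillic code page can represent it and round-trips cleanly.
cyrillic = name.encode("cp1251")
roundtrip_ok = cyrillic.decode("cp1251") == name
```

Because the result depends on the machine's code page, encoding the file name by hand is fragile. Passing the unicode object to open() leaves the conversion to Python.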
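General advice on hard-coding file names in Windows, in descending order of preference:
(1) Don't.
(2) Use /, for example "c:/temp.xml".
(3) Use a raw string with backslashes: r"c:\temp.xml".
(4) Use doubled backslashes: "c:\\temp.xml".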
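The options above can be compared directly. This is a small sketch that uses ntpath, the Windows flavour of os.path, which can be imported on any platform. The variable names are made up for the example.

```python
import ntpath  # Windows path rules, usable on any OS

plain = "c:\temp.xml"      # "\t" is read as a TAB character
forward = "c:/temp.xml"    # option (2)
raw = r"c:\temp.xml"       # option (3)
doubled = "c:\\temp.xml"   # option (4)

has_tab = "\t" in plain                 # the unescaped literal is corrupted
same_literal = (raw == doubled)         # (3) and (4) spell the same string
normalized = ntpath.normpath(forward)   # "/" is normalized to "\"
```

Here has_tab and same_literal are both True, and normalized equals raw, so options (2) to (4) all name the same file while the plain literal does not.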