Question

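My file is named "SSE-Künden, SSE-Händler.pdf". When I print this filename in the Python interpreter, the two Unicode characters (ü, ä) are replaced by what I guess are their corresponding ASCII values, giving 'SSE-K\x81nden, SSE-H\x84ndler.pdf', but I want the original name to be shown.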

The test dir contains a PDF file named "SSE-Künden, SSE-Händler.pdf".

I tried this:

path = 'C:\test'
for a, b, c in os.walk(path):
    print c

['SSE-K\x81nden, SSE-H\x84ndler.pdf']

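How can I convert these ASCII characters back to their respective Unicode values? I want to display the original name ("SSE-Künden, SSE-Händler.pdf") in the interpreter and also write it to a file. How do I achieve this? I am using Python 2.6 on Windows.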

Thanks.

Answer 1

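Assuming your terminal supports displaying the characters, iterate over the file list and print each name individually (or use Python 3, which displays Unicode inside lists):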

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> for p,d,f in os.walk(u'.'):
...  for n in f:
...   print n
...
SSE-Künden, SSE-Händler.pdf

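Also note that I used a Unicode string (u'.') as the path. This instructs os.walk to return Unicode strings instead of byte strings, which is a good idea when dealing with non-ASCII filenames.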
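As a rough sketch of the point above, the helper below passes a Unicode path to os.walk and collects the names it returns. The name list_unicode_names is made up for illustration and is not part of the original answer; the code is intended to run on both Python 2.6+ and Python 3.3+.

```python
import os


def list_unicode_names(root=u'.'):
    """Return all file names under root as Unicode strings.

    list_unicode_names is a hypothetical helper for illustration.
    """
    names = []
    # A Unicode root makes os.walk yield Unicode names on Python 2;
    # on Python 3 every str is already Unicode.
    for dirpath, dirnames, filenames in os.walk(root):
        names.extend(filenames)
    return names


if __name__ == '__main__':
    for name in list_unicode_names(u'.'):
        # Printing one name at a time shows the characters themselves,
        # provided the terminal can display them.
        print(name)
```

Printing the whole list instead of each element would show the repr of every name, which on Python 2 contains escape codes.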

In Python 3, strings are Unicode by default, and non-ASCII characters are displayed to the user as-is rather than as escape codes:

Python 3.2.1 (default, Jul 10 2011, 21:51:15) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> for p,d,f in os.walk('.'):
...  print(f)
...
['SSE-Künden, SSE-Händler.pdf']

Answer 2

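import os

for a, b, c in os.walk(path):  # path is a byte string, as in the question
    for n in c:
        # Assumes UTF-8 bytes; a lone '\x81' byte raises UnicodeDecodeError.
        print n.decode('utf-8')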
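Decoding only works when the codec matches the bytes. The values \x81 and \x84 from the question are not valid UTF-8 on their own, but they do map to ü and ä in code pages 437 and 850. Assuming the names were encoded in one of those code pages (a guess from the byte values, not something the question states), a minimal sketch looks like this; decode_name is a hypothetical helper:

```python
def decode_name(raw, encoding='cp437'):
    """Decode a byte-string file name to Unicode.

    decode_name and the cp437 default are assumptions for illustration;
    pass the encoding that actually produced the bytes.
    """
    return raw.decode(encoding)


raw_name = b'SSE-K\x81nden, SSE-H\x84ndler.pdf'
decoded = decode_name(raw_name)

# In cp437, byte 0x81 is U+00FC (ü) and byte 0x84 is U+00E4 (ä).
print(decoded == u'SSE-K\xfcnden, SSE-H\xe4ndler.pdf')
```

Passing a Unicode path to os.walk avoids this guessing entirely, since the names then arrive already decoded.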

Answer 3

Writing to a file: http://docs.python.org/howto/unicode.html#reading-and-writing-unicode-data
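One way to do this, sketched here under the assumption that UTF-8 is an acceptable encoding for the output file, is to open the file through io.open with an explicit encoding. The helper name write_names is invented for this example.

```python
import io


def write_names(names, out_path):
    """Write one Unicode file name per line, encoded as UTF-8.

    write_names and the UTF-8 choice are assumptions for illustration.
    """
    # io.open accepts an encoding argument on Python 2.6+ and is the
    # same function as the built-in open on Python 3.
    with io.open(out_path, 'w', encoding='utf-8') as handle:
        for name in names:
            # The names must be Unicode strings, not byte strings.
            handle.write(name + u'\n')
```

Calling write_names([u'SSE-K\xfcnden, SSE-H\xe4ndler.pdf'], u'names.txt') would store the name with ü and ä intact, so any editor that reads UTF-8 shows the original spelling.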

Handling ASCII chars in a Python string
