Question

Python version: 2.7.3

File name: test snowman character --☃--.mp3

I ran the following tests; none of them succeeded.

>>> os.path.exists('test snowman character --☃--.mp3')
False
>>> os.path.exists(repr('test snowman character --☃--.mp3'))
False
>>> os.path.isfile('test snowman character --\\xe2\\x98\\x83--.mp3')
False
>>> os.path.isfile(r'test snowman character --\\xe2\\x98\\x83--.mp3')
False
>>> os.path.isfile('test snowman character --☃--.mp3'.decode('utf-8'))
False

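I also tried to retrieve the file with glob; even that test failed.

The goal is to detect this file and copy it to another folder. Please advise.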

Answer 1

Use a unicode value; preferably write it with a Unicode escape sequence:

os.path.isfile(u'test snowman character --\u2603--.mp3')

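When you give it a unicode path, Python on Windows uses the correct Windows API for listing files with UTF-16 names.

See the Python Unicode HOWTO for more information on how Python changes its behavior for unicode versus bytestring file paths.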
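A minimal sketch of the detect-and-copy goal from the question, assuming every path handed to the filesystem calls is a unicode string. The function name copy_matching_files, the pattern, and the folder names are hypothetical and only for illustration; the code is written to run on both Python 2.7 and Python 3.

```python
import fnmatch
import os
import shutil


def copy_matching_files(src_dir, dst_dir, pattern):
    """Copy files in src_dir whose names match pattern into dst_dir.

    All three arguments should be unicode strings (u'...'), so that
    os.listdir returns unicode names instead of byte strings.
    Returns the list of file names that were copied.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for name in os.listdir(src_dir):  # unicode path in, unicode names out
        src_path = os.path.join(src_dir, name)
        if os.path.isfile(src_path) and fnmatch.fnmatch(name, pattern):
            shutil.copy2(src_path, os.path.join(dst_dir, name))
            copied.append(name)
    return copied
```

A call such as copy_matching_files(u'.', u'backup', u'*\u2603*.mp3') would then pick up the snowman file without the character ever being typed at the console.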

Answer 2

The Windows NTFS file system uses UTF-16 (just ask Martijn Pieters), so try this:

>>> os.path.exists(u'test snowman character --☃--.mp3'.encode("UTF-16"))

But first make sure the interpreter's input encoding is correct. print repr(u'test snowman character --☃--.mp3') should output:

u'test snowman character --\u2603--.mp3'

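Note: I could not test this, because the Windows CMD console will not let me enter the snowman character. In any case, if you just give it a Unicode string, Python will do the right thing, so the encode call is redundant. All in all, I recommend Martijn Pieters' answer.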
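The encoding check described above can be sketched without typing the snowman at all, by building the name from the \u2603 escape. The variable names are illustrative only. It also shows what the UTF-16 encode call actually produces: a byte string that begins with a byte-order mark, which is not a path naming this file.

```python
name = u'test snowman character --\u2603--.mp3'

# The UTF-8 encoding of U+2603 is the \xe2\x98\x83 byte sequence
# that appears in the attempts from the question.
utf8_bytes = name.encode('utf-8')
assert utf8_bytes == b'test snowman character --\xe2\x98\x83--.mp3'

# Decoding those bytes as UTF-8 gives back the same unicode string.
assert utf8_bytes.decode('utf-8') == name

# Encoding to UTF-16 yields bytes prefixed with a byte-order mark
# (little- or big-endian, depending on the platform).
utf16_bytes = name.encode('utf-16')
has_bom = utf16_bytes[:2] in (b'\xff\xfe', b'\xfe\xff')
```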

Answer 3

Literal Unicode strings should be prefixed with u; try os.path.exists(u'test snowman character --☃--.mp3')

If you want to use escape sequences, do the same with a prefix such as ur'

http://docs.python.org/2.7/reference/lexical_analysis.html#strings

Detecting file names with Unicode characters in Windows
