Python converts Japanese input to question marks

Asked: 2013-04-21 21:13:01

Tags: python filenames

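I'm trying to write a program that takes file names and renames them by changing the order of the words. It works for most files, but some of my files have Japanese characters in their names, and the program fails on those. I think this is because it converts the characters to question marks (I checked with print) and then can't find the file, since the real file name contains Japanese characters, not question marks. How can I fix this?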

Edit: Yes, I'm using Windows.

My code is reposted below (I'm very new to this, so it may be quite inefficient and hard to read).

import os

def Filenames(filelist):
    filenames = []
    for name in filelist:
        name = name.split(".") #Take off file extension
        filenames.append(name)
    return filenames

def ReformatName(directory):
    filelist = []
    name = []    

    filelist = os.listdir(directory)
    filenames = Filenames(filelist)

    for doc in filenames: #Docs are in form "Date Name Subject DocName", want to turn into "Subject DocName Date"
        doc1 = doc[0].split(" ") #doc is a [base, extension] list, so split the base name into words
        date = doc1[0]
        subject = doc1[2]
        docname = doc1[3]

        newdoc = "%s %s %s.docx" %(subject, docname, date)
        doc = ".".join(doc)
        os.rename(os.path.normpath(directory + os.sep + doc), os.path.normpath(directory + os.sep + newdoc))
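
The symptom described above can be reproduced without touching the file system. The following is a minimal sketch, not part of the original program: the file name and the cp1252 code page are assumptions chosen only for illustration. It shows that encoding a name into a code page that lacks Japanese characters, with replacement enabled, produces question marks, which is presumably what happens when Python 2 lists a directory through a byte-string path on Windows.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical file name; the Japanese word is written with escapes so the
# snippet does not depend on the encoding of the source file.
name = u"2013-04-21 Tanaka \u65e5\u672c\u8a9e Report.docx"

# cp1252 stands in for any code page without Japanese characters (assumption).
# With the 'replace' error handler, each unencodable character becomes '?'.
lossy = name.encode("cp1252", "replace")

print(repr(lossy))
```

Once the name has been reduced to question marks, it no longer matches the real file on disk, so a later os.rename on that name cannot find it.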

1 Answer:

Answer 0 (score: 1)

I found a rather complicated solution to the Windows console problem:

# -*- coding: utf-8 -*-
import sys
import codecs

def setup_console(sys_enc="utf-8"):
    reload(sys)  # Python 2 only: restores sys.setdefaultencoding, which site.py removes at startup
    # Calling a system library function if we're using win32
    if sys.platform.startswith("win"):
        import ctypes
        enc = "cp%d" % ctypes.windll.kernel32.GetOEMCP() #TODO: check on win64/python64
    else:
        # It seems like for Linux everything already exists
        enc = (sys.stdout.encoding if sys.stdout.isatty() else
                    sys.stderr.encoding if sys.stderr.isatty() else
                        sys.getfilesystemencoding() or sys_enc)

    # Encoding for sys
    sys.setdefaultencoding(sys_enc)

    # Redefining standard output streams if they aren't redirected
    if sys.stdout.isatty() and sys.stdout.encoding != enc:
        sys.stdout = codecs.getwriter(enc)(sys.stdout, 'replace')

    if sys.stderr.isatty() and sys.stderr.encoding != enc:
        sys.stderr = codecs.getwriter(enc)(sys.stderr, 'replace')

Source: http://habrahabr.ru/post/117236/ (available in Russian only)
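
The console setup above affects how names are printed, not how they are read from disk. As a complementary sketch (not taken from the answer or its source), the renaming itself can be done on Unicode names: on Python 2, os.listdir returns Unicode names when it is given a Unicode path, and on Python 3 str paths are already Unicode. The function names reorder and reformat_names are hypothetical, and the four-word "Date Name Subject DocName" pattern is assumed from the question. Unlike the question's code, this version keeps each file's own extension and skips names that do not match the pattern.

```python
# -*- coding: utf-8 -*-
import os
import sys


def reorder(filename):
    """Turn u"Date Name Subject DocName.ext" into u"Subject DocName Date.ext".

    Returns None when the base name is not made of exactly four words.
    """
    base, ext = os.path.splitext(filename)
    parts = base.split(u" ")
    if len(parts) != 4:
        return None
    date, _name, subject, docname = parts
    return u"%s %s %s%s" % (subject, docname, date, ext)


def reformat_names(directory):
    """Rename every matching file in `directory`, working on Unicode names."""
    # A byte-string path is decoded first, so that os.listdir hands back
    # Unicode names instead of names already reduced to '?'.
    if isinstance(directory, bytes):
        directory = directory.decode(sys.getfilesystemencoding())
    for old in os.listdir(directory):
        new = reorder(old)
        if new is not None:
            os.rename(os.path.join(directory, old),
                      os.path.join(directory, new))
```

Calling reformat_names(u"C:\\docs") with a hypothetical folder would rename the files in place. Printing the Unicode names to a Windows console may still need something like the setup_console function above.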