Question

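We are writing a complex website with i18n. To make translating easier, we store the translations in a model. Our staff write and edit the translations through django-admin. When a translation is finished, a management script is started that writes the po files and then runs Django's compilemessages. I know the po files have to be written in utf-8. But after opening the application I still get the error "'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)" when using languages with special characters, such as Spanish or French. What am I doing wrong?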

Here is my (shortened) code:

class Command(NoArgsCommand):

    def handle_noargs(self, **options):

        languages = XLanguage.objects.all()
        currPath = os.getcwd()

        for lang in languages:

            path = "{}/framework/locale/{}/LC_MESSAGES/".format(currPath, lang.langToplevel)

            # check and create path
            create_path(path)

            # add filename
            path = path + "django.po"

            with codecs.open(path, "w", encoding='utf-8') as file:

                # select all textitems for this language from XTranslation
                translation = XTranslation.objects.filter(langID=lang)

                for item in translation:

                    # check if menu-item
                    if item.textID.templateID:
                        msgid = u"menu_{}_label".format(item.textID.templateID.id)
                    else:
                        msgid = u"{}".format(item.textID.text_id)

                    trans = u"{}".format(item.textTranslate)

                    text = u'msgid "{}"      msgstr "{}"\n'.format(msgid, trans)

                    file.write(text)

            file.close()

Traceback:

Environment:

Request Method: GET
Request URL: http://127.0.0.1:8000/

Django Version: 1.7
Python Version: 3.4.0
Installed Applications:
('django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'simple_history',
'datetimewidget',
'payroll',
'framework',
'portal',
'pool',
'billing')
Installed Middleware:
('django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
'simple_history.middleware.HistoryRequestMiddleware')


Traceback:
File "c:\python34\lib\site-packages\django\core\handlers\base.py" in get_response
  111. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "E:\python\sarlex\framework\views.py" in init
   34. activate("de")
File "c:\python34\lib\site-packages\django\utils\translation\__init__.py" in activate
  145. return _trans.activate(language)
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in activate
  225. _active.value = translation(language)
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in translation
  210. current_translation = _fetch(language, fallback=default_translation)
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in _fetch
  195. res = _merge(apppath)
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in _merge
  177. t = _translation(path)
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in _translation
  159. t = gettext_module.translation('django', path, [loc], DjangoTranslation)
File "c:\python34\lib\gettext.py" in translation
  410. t = _translations.setdefault(key, class_(fp))
File "c:\python34\lib\site-packages\django\utils\translation\trans_real.py" in __init__
  107. gettext_module.GNUTranslations.__init__(self, *args, **kw)
File "c:\python34\lib\gettext.py" in __init__
  160. self._parse(fp)
File "c:\python34\lib\gettext.py" in _parse
  300. catalog[str(msg, charset)] = str(tmsg, charset)

Exception Type: UnicodeDecodeError at /
Exception Value: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Answer 1

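Whenever you get encoding/decoding errors, it means you are handling Unicode incorrectly. This usually happens when you mix Unicode with byte strings, which prompts Python 2.x to implicitly decode your byte strings to Unicode using the default encoding, 'ascii'. That is why you get an error like this: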

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

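The best way to avoid these errors is to work ONLY with Unicode inside your program: explicitly decode all input byte strings to Unicode with 'utf-8' (or another Unicode encoding of your choice), and mark the string literals in your code as Unicode with the u'' prefix. When you write out to a file, explicitly encode these strings back to byte strings with 'utf-8'.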
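The decode-on-input, encode-on-output pattern described above can be sketched roughly as follows. This is only an illustration: the helper names `decode_input` and `write_text` and the sample bytes are made up for the example and are not part of the question's code.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def decode_input(raw, encoding="utf-8"):
    """Decode bytes at the program boundary; pass text through unchanged."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


def write_text(path, text, encoding="utf-8"):
    """Encode explicitly on the way out by opening the file with an encoding."""
    # newline="\n" keeps line endings identical on every platform
    with io.open(path, "w", encoding=encoding, newline="\n") as handle:
        handle.write(text)


# Hypothetical input: UTF-8 bytes for a word with a non-ASCII character.
raw = b"Espa\xc3\xb1ol"
text = decode_input(raw)                  # Unicode inside the program
line = u"language: {}\n".format(text)     # only Unicode is mixed with Unicode

path = os.path.join(tempfile.mkdtemp(), "example.txt")
write_text(path, line)

# Read the file back as raw bytes to see what was actually written.
with io.open(path, "rb") as handle:
    written = handle.read()
```

Inside the program everything is text; bytes only exist at the two edges, where the encoding is named explicitly.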

For your code specifically, I'm guessing that either

msgid = "menu_{}_label".format(item.textID.templateID.id)

or

text = 'msgid "{}"      msgstr "{}"\n'.format(msgid, item.textTranslate)

is throwing the error. Try making msgid and text Unicode strings instead of byte strings by declaring them like this:

msgid = u"menu_{}_label".format(item.textID.templateID.id)

and

text = u'msgid "{}"      msgstr "{}"\n'.format(msgid, item.textTranslate)

I'm assuming that the values of item.textID.templateID.id and item.textTranslate are both Unicode. If they are not (i.e. they are byte strings), you will have to decode them first.

Finally, here is a very good presentation on how to handle Unicode in Python: http://nedbatchelder.com/text/unipain.html. If you do a lot of i18n work, I strongly recommend going through it.

EDIT 1: since item.textID.templateID.id and item.textTranslate are byte strings, your code should be:

for item in translation:
    # check if menu-item
    if item.textID.templateID:
        msgid = u"menu_{}_label".format(item.textID.templateID.id.decode('utf-8'))
    else:
        msgid = item.textID.text_id.decode('utf-8')  # you don't need to do u"{}".format() here since there's only one replacement field

    trans = item.textTranslate.decode('utf-8')  # same here, no need for u"{}".format()
    text = u'msgid "{}"      msgstr "{}"\n'.format(msgid, trans)  # msgid and trans should both be Unicode at this point
    file.write(text)

EDIT 2: the original code is in Python 3.x, so none of the above applies.
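To illustrate why the Python 2 advice does not carry over, here is a small Python 3 sketch. It assumes the model fields return ordinary `str` values, which is what Django text fields normally give on Python 3; the sample word is made up.

```python
# Python 3 sketch: text is already str, so there is nothing to .decode().
text = "Español"  # hypothetical value, standing in for item.textTranslate

# str objects have encode() but no decode() on Python 3.
has_decode = hasattr(text, "decode")

# The same kind of error as in the traceback appears only when UTF-8
# *bytes* are decoded with the ascii codec somewhere along the way.
encoded = text.encode("utf-8")
try:
    encoded.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
```

So on Python 3 the place to look is wherever bytes are turned back into text (here, the compiled catalog being loaded), not the string formatting in the command.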

Answer 2

I ran into the same error, and this helped me: https://stackoverflow.com/a/23278373/2571607

Basically, for me this was a problem with Python itself. My solution was to open C:\Python27\Lib\mimetypes.py

and replace

default_encoding = sys.getdefaultencoding()

with

if sys.getdefaultencoding() != 'gbk':  
    reload(sys)  
    sys.setdefaultencoding('gbk')  
default_encoding = sys.getdefaultencoding()

Answer 3

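Found the solution! I was writing msgid and msgstr on a single line, separated by spaces, to make the file more readable. This works for English, but it raises the error for languages with special characters, such as Spanish or French. After writing msgid and msgstr on two separate lines, it works.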
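A minimal sketch of writing each entry on two lines might look like this. The helper names `escape_po` and `format_po_entry` and the sample entries are invented for illustration. The header entry declaring `charset=UTF-8` is an additional assumption on my part, not something confirmed in this thread: as far as I know, Python's gettext falls back to ASCII when the compiled catalog declares no charset, which would match the traceback.

```python
def escape_po(value):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def format_po_entry(msgid, msgstr):
    """Return one catalog entry with msgid and msgstr on separate lines."""
    return 'msgid "{}"\nmsgstr "{}"\n\n'.format(escape_po(msgid),
                                                escape_po(msgstr))


# Assumed header: the entry with an empty msgid carries the catalog metadata.
PO_HEADER = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=UTF-8\\n"\n'
    '"Content-Transfer-Encoding: 8bit\\n"\n'
    '\n'
)

# Hypothetical translations standing in for the XTranslation rows.
entries = [("menu_1_label", "Menú"), ("greeting", 'Dire "bonjour"')]
po_text = PO_HEADER + "".join(format_po_entry(m, t) for m, t in entries)
```

In the management command, `po_text` would then be written through the same `codecs.open(path, "w", encoding='utf-8')` call as before, followed by compilemessages.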

Django problem with i18n on Python 3

3 answers: