GNU gettext msgfilter program says "invalid multibyte sequence"

Time: 2018-06-06 18:19:01

Tags: locale gettext

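The GNU gettext program msgfilter does not seem to accept a UTF-8 string as the output of the script given as the filter. The script simply returns prepared text read from a file.

Here is the test setup: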

echo '#!/bin/bash
cat /tmp/t3.txt
' > /tmp/trans01.sh
chmod a+rwx /tmp/trans01.sh

Then there is a file /tmp/t3.txt:

cat /tmp/t3.txt

Result:

AMSTERDAM REISEFÜHRER FÜR REISE, UNTERKUNFT, SEHENSWÜRDIGKEITEN     

It is a UTF-8 file:

file /tmp/t3.txt

gives:

/tmp/t3.txt: UTF-8 Unicode text

Furthermore:

echo 'msgid "kk71ams_amsterdam_main_page_title"
msgstr "AMSTERDAM TOURIST GUIDE FOR TRAVEL, ACCOMMODATION, ATTRACTIONS"
' > /tmp/te1.po

Then:

cat /tmp/te1.po

gives:

msgid "kk71ams_amsterdam_main_page_title"
msgstr "AMSTERDAM TOURIST GUIDE FOR TRAVEL, ACCOMMODATION, ATTRACTIONS"

Then:

file /tmp/te1.po

gives:

/tmp/te1.po: GNU gettext message catalogue, ASCII text

Locale settings:

:~# locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Now the problem with msgfilter:

~# msgfilter -i /tmp/te1.po '/tmp/trans01.sh'
msgid "kk71ams_amsterdam_main_page_title"
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
/tmp/te1.po:2: invalid multibyte sequence
msgstr "AMSTERDAM REISEFHRER FR REISE, UNTERKUNFT, SEHENSWRDIGKEITEN\n"

1 answer:

Answer 0: (score: 0)

My situation was not exactly the same, but I ran into the same error and solved it by setting the correct Content-Type in the PO file header.

I had:

"Content-Type: text/plain; charset=ASCII\n"

which seems to be the default.

I changed it to:

"Content-Type: text/plain; charset=UTF-8\n"

Even though my file was already UTF-8, I had to change the charset in the Content-Type explicitly.
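
To make the change above concrete, here is a minimal sketch of a PO file whose header entry declares UTF-8. The file name te1_utf8.po is made up for illustration, and this assumes that the missing header is what triggers the error in the question, which has not been verified against the asker's setup.

```shell
#!/bin/bash
# Sketch: write a PO file whose header entry declares UTF-8.
# The file name is hypothetical, not taken from the original post.
po_file="${TMPDIR:-/tmp}/te1_utf8.po"

# The header is the entry with an empty msgid; its msgstr carries the
# Content-Type line that tells gettext tools which charset the catalog uses.
cat > "$po_file" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "kk71ams_amsterdam_main_page_title"
msgstr "AMSTERDAM TOURIST GUIDE FOR TRAVEL, ACCOMMODATION, ATTRACTIONS"
EOF

# Print the declared charset so it can be checked before running msgfilter.
grep -o 'charset=[A-Za-z0-9_-]*' "$po_file"
```

With a header like this in place, the next thing to try would be something like msgfilter --keep-header -i on that file with /tmp/trans01.sh as the filter. The --keep-header option should stop msgfilter from passing the header entry itself through the filter script, which would otherwise overwrite the Content-Type line with the filter's output.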