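I need to convert a string from one encoding (UTF-8) to another. The problem is that the target encoding does not contain every character of the source encoding, and the libc iconv(3) function fails in that case. What I want is for the conversion to go through anyway, with each problematic character replaced in the output string by some symbol, say '?'.
The programming language is C or C++.
Is there a way to solve this?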
Answer 0 (score: 2)
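Try appending "//TRANSLIT" or "//IGNORE" to the end of the target charset string. Note that this is a GNU extension: it is supported by the GNU C library (and GNU libiconv), not by iconv implementations in general.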
//TRANSLIT
When the string "//TRANSLIT" is appended to tocode, transliteration
is activated. This means that when a character cannot be
represented in the target character set, it can be approximated
through one or several similarly looking characters.
//IGNORE
When the string "//IGNORE" is appended to tocode, characters
that cannot be represented in the target character set will be
silently discarded.
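As a minimal sketch of the suffix approach, assuming glibc (or GNU libiconv): the helper name `convert_with_suffix` and its parameters are made up for illustration, only `iconv_open`, `iconv` and `iconv_close` are real API. It judges success by whether all input was consumed, because with "//IGNORE" glibc may, as far as I know, still report EILSEQ after discarding characters.

```c
#include <assert.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: converts the UTF-8 string `in` to `tocode` with a GNU
 * suffix such as "//TRANSLIT" or "//IGNORE" appended to the target name.
 * Writes a NUL-terminated result to `out`.
 * Returns 0 if all input was consumed, -1 otherwise. */
int convert_with_suffix(const char *tocode, const char *suffix,
                        const char *in, char *out, size_t outsize)
{
    assert(tocode && suffix && in && out && outsize > 0);

    /* Build e.g. "ASCII//TRANSLIT". */
    char target[128];
    int n = snprintf(target, sizeof target, "%s%s", tocode, suffix);
    if (n < 0 || (size_t)n >= sizeof target)
        return -1;

    iconv_t cd = iconv_open(target, "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;              /* suffix or charset not supported here */

    char *src = (char *)in;     /* iconv takes char **, but does not modify the input */
    size_t srcleft = strlen(in);
    char *dst = out;
    size_t dstleft = outsize - 1;   /* keep room for the terminating NUL */

    /* The return value is deliberately not used: success is judged by the
     * remaining input, since an error may be reported even though
     * unconvertible characters were dropped as requested. */
    (void)iconv(cd, &src, &srcleft, &dst, &dstleft);

    *dst = '\0';
    iconv_close(cd);
    return srcleft == 0 ? 0 : -1;
}
```

Called as `convert_with_suffix("ASCII", "//TRANSLIT", text, buf, sizeof buf)`, a character such as é may come out as `e` or as `?` depending on the current locale; with "//IGNORE" it is simply dropped.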
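Alternatively, when iconv(3) returns (size_t)-1 with errno set to EILSEQ, manually skip one character in the input and insert a replacement into the output.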
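A rough sketch of that manual fallback could look like the following. It assumes the input is UTF-8, the target is a stateless, ASCII-compatible single-byte encoding (so a single `char` can serve as the replacement), and an iconv that reports unconvertible characters with EILSEQ, as glibc does. The names `utf8_seq_len` and `convert_with_replacement` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 sequence starting at p (at least 1, at most
 * `avail`). Malformed or truncated sequences are cut short so that we never
 * skip past the start of the next character. */
static size_t utf8_seq_len(const unsigned char *p, size_t avail)
{
    size_t n = 1;
    if ((p[0] & 0xE0) == 0xC0) n = 2;
    else if ((p[0] & 0xF0) == 0xE0) n = 3;
    else if ((p[0] & 0xF8) == 0xF0) n = 4;
    if (n > avail) n = avail;

    size_t i = 1;
    while (i < n && (p[i] & 0xC0) == 0x80)  /* count continuation bytes */
        i++;
    return i;
}

/* Converts the UTF-8 string `in` to `tocode`, writing `replacement` for every
 * character iconv rejects. Returns 0 on success, -1 on any other failure
 * (unknown charset, output buffer too small). */
int convert_with_replacement(const char *tocode, const char *in,
                             char *out, size_t outsize, char replacement)
{
    assert(tocode && in && out && outsize > 0);

    iconv_t cd = iconv_open(tocode, "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;
    size_t srcleft = strlen(in);
    char *dst = out;
    size_t dstleft = outsize - 1;   /* keep room for the terminating NUL */
    int rc = 0;

    while (srcleft > 0) {
        if (iconv(cd, &src, &srcleft, &dst, &dstleft) != (size_t)-1)
            break;                  /* everything converted */

        /* EILSEQ: invalid or unconvertible character at src.
         * EINVAL: truncated sequence at the end of the input.
         * Anything else (e.g. E2BIG) is treated as a hard error. */
        if ((errno != EILSEQ && errno != EINVAL) || dstleft == 0) {
            rc = -1;
            break;
        }

        /* Skip the offending character and emit the placeholder. */
        size_t skip = utf8_seq_len((const unsigned char *)src, srcleft);
        src += skip;
        srcleft -= skip;
        *dst++ = replacement;
        dstleft--;
    }

    *dst = '\0';
    iconv_close(cd);
    return rc;
}
```

Unlike the "//TRANSLIT" suffix, this relies only on the standard iconv calls and error codes, but the skip logic is tied to UTF-8 as the source encoding.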
Answer 1 (score: 0)
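Use a regular expression built from the range of source characters that can be translated, and swap every character that does not match it for the appropriate placeholder.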