Question

我需要将特殊字符（在本例中为Ð字符）从cp850转换为unicode，而我无法使用mb_convert_encoding进行转换。正确的转换应该是从西班牙语中的Ð到Ñ，但函数mb_convert_enconding（'Ð'，'utf-8'）返回Ã。

你知道为什么会这样吗？

提前致谢。

Answer 1

如果将utf8_encode（）应用于已经是UTF8的字符串，它将返回一个乱码的UTF8输出。

我做了一个解决所有这些问题的函数。它被称为Encoding::toUTF8()。

您不需要知道字符串的编码是什么。它可以是Latin1（iso 8859-1），Windows-1252或UTF8，或者字符串可以混合使用它们。 Encoding::toUTF8()会将所有内容转换为UTF8。

用法：

require_once('Encoding.php'); 
use \ForceUTF8\Encoding;  // It's namespaced now.
$utf8_string = Encoding::fixUTF8($garbled_utf8_string);

下载：

https://github.com/neitanod/forceutf8

示例：

echo Encoding::fixUTF8("FÃ©dÃ©ration Camerounaise de Football");
echo Encoding::fixUTF8("FÃÂ©dÃÂ©ration Camerounaise de Football");
echo Encoding::fixUTF8("FÃÂÃÂ©dÃÂÃÂ©ration Camerounaise de Football");
echo Encoding::fixUTF8("FÃÂ©dération Camerounaise de Football");

将输出：

Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football

我已经将函数（forceUTF8）转换为一个名为Encoding的类的静态函数族。新功能是Encoding::toUTF8()。

Answer 2

您需要传递源编码：

print mb_convert_enconding('Ð', 'utf-8', 'CP850');

如果不这样做，默认顺序将用于尝试猜测原始编码，并且通常首先检测UTF8和/或Latin1。

将cp850转换为utf - 字符Ð

2 个答案: