Question

I have a strange problem. The code is as follows:

$str = "נסיון"; // <--- Hebrew chars
echo mb_detect_encoding($str)."<br><br><br>";
$str = iconv(mb_detect_encoding($str), 'UCS-2BE', $str);
echo mb_detect_encoding($str)."<br><br><br>";

This outputs:

UTF-8

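The code is written in a file saved as UTF-8 without a BOM (using Notepad++). I tried other file encodings, but that did not help.

I also tried converting the string with: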

$str = mb_convert_encoding($str,'UCS-2BE');

But that did not work either. Any insights?

Answer 1

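From the documentation for mb_detect_order, the function that determines the order in which mb_detect_encoding tests the different encodings:

mbstring currently implements the following encoding detection filters. If there is an invalid byte sequence for the following encodings, encoding detection will fail.
UTF-8, UTF-7, ASCII, EUC-JP, SJIS, eucJP-win, SJIS-win, JIS, ISO-2022-JP

For ISO-8859-*, mbstring always detects as ISO-8859-*.

For UTF-16, UTF-32, UCS2 and UCS4, encoding detection will fail always.

So you cannot use the mb functions to detect the encoding of the second string (the one already converted to UCS-2BE).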
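
Since detection cannot confirm the result, one way to check that the conversion itself worked is to look at the bytes directly. The following is only a minimal sketch, assuming the file is saved as UTF-8 and the mbstring extension is loaded; the variable names $utf8, $ucs2 and $back are made up for illustration. It passes the source encoding explicitly instead of detecting it, prints both byte sequences as hex, and converts back to confirm the round trip.

```php
<?php
// Assumption: this file is saved as UTF-8 (no BOM) and mbstring is enabled.
$utf8 = "נסיון";

// Name the source encoding explicitly instead of relying on detection.
$ucs2 = mb_convert_encoding($utf8, 'UCS-2BE', 'UTF-8');

// Inspect the raw bytes: in UCS-2BE each of the five letters is one
// 2-byte big-endian code unit (U+05E0, U+05E1, U+05D9, U+05D5, U+05DF).
echo bin2hex($utf8), "\n"; // d7a0d7a1d799d795d79f
echo bin2hex($ucs2), "\n"; // 05e005e105d905d505df

// The converted bytes are no longer a valid UTF-8 sequence.
var_dump(mb_check_encoding($ucs2, 'UTF-8')); // bool(false)

// Converting back with the known encoding restores the original string.
$back = mb_convert_encoding($ucs2, 'UTF-8', 'UCS-2BE');
var_dump($back === $utf8); // bool(true)
```

In other words, the conversion in the question most likely did succeed; it is only the second mb_detect_encoding call that cannot report it. If the encoding of a string is known from context, it is more reliable to keep track of it than to detect it afterwards.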
