Special characters display incorrectly when decoding HTML entities from a string

Time: 2014-01-24 11:37:45

Tags: php decoding

function decode_entities($text) {
    $text= html_entity_decode($text,ENT_QUOTES,"ISO-8859-1"); #NOTE: UTF-8 does not work!
    $text= preg_replace('/&#(\d+);/me',"chr(\\1)",$text); #decimal notation
    $text= preg_replace('/&#x([a-f0-9]+);/mei',"chr(0x\\1)",$text);  #hex notation
    return $text;
}

echo decode_entities("For tiden er neste president i det afrikanske landet Burkina Faso 11 &aring;r");

echo html_entity_decode("For tiden er neste president i det afrikanske landet Burkina Faso 11 &aring;r",'UTF-8');

I am using the function above to decode HTML entities from a string, but special characters are displayed incorrectly after decoding (Demo).

2 answers:

Answer 0: (score: 1)

Try forcing the charset the browser uses for display by echoing a meta tag...

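echo "<meta charset='UTF-8'>";
// The charset is the third argument; the second is the flags bitmask (e.g. ENT_QUOTES).
echo html_entity_decode("For tiden er neste president i det afrikanske landet Burkina Faso 11 &aring;r", ENT_QUOTES, 'UTF-8');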
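
To illustrate why the declared charset matters, here is a minimal sketch that compares the bytes produced for `&aring;` under the two target charsets mentioned in this thread. It assumes the mbstring extension is loaded (needed for `mb_check_encoding`); the variable names are illustrative only.

```php
<?php
// Decode the same entity into two different target charsets.
$latin1 = html_entity_decode("&aring;", ENT_QUOTES, 'ISO-8859-1');
$utf8   = html_entity_decode("&aring;", ENT_QUOTES, 'UTF-8');

echo bin2hex($latin1), "\n"; // e5   (one byte)
echo bin2hex($utf8), "\n";   // c3a5 (two bytes)

// The single ISO-8859-1 byte is not a valid UTF-8 sequence, so a page
// declared as UTF-8 cannot render it correctly.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

The bytes that are output and the charset the page declares have to agree. Decoding to ISO-8859-1 while the page is served as UTF-8 is one way to end up with a broken character.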

Answer 1: (score: 0)

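For me, the UTF-8 charset argument of html_entity_decode works fine; I tested your phpfiddle script. If it does not work for you, try declaring the charset in the content type header with header('Content-Type: text/html; charset=UTF-8');

Taking into account the wrong argument position in your example, the code that works for me looks like this:

header('Content-Type: text/html; charset=UTF-8');
echo html_entity_decode("For tiden er neste president i det afrikanske landet Burkina Faso 11 &aring;r", ENT_QUOTES, 'UTF-8');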
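
As a side note, html_entity_decode also decodes decimal and hexadecimal numeric entities, so the two extra regex passes in the question's function may not be needed when the target charset is UTF-8. Those passes also rely on the `/e` modifier, which was removed in PHP 7.0. The sketch below is one possible simplification; the helper name `decode_entities_utf8` is hypothetical and does not come from the original post, and the `\u{...}` escapes assume PHP 7 or later.

```php
<?php
// Hypothetical replacement for the question's decode_entities().
// A single call handles named (&aring;), decimal (&#229;) and hex (&#xE5;) entities.
function decode_entities_utf8(string $text): string
{
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

$decoded = decode_entities_utf8("11 &aring;r, 11 &#229;r, 11 &#xE5;r");

// Compare against the expected UTF-8 string, written with \u{...} escapes
// so the check does not depend on this file's own encoding.
var_dump($decoded === "11 \u{e5}r, 11 \u{e5}r, 11 \u{e5}r"); // bool(true)
```

Because all three notations are decoded straight to UTF-8, no chr() call is involved. chr() yields a single byte, which cannot represent å in UTF-8.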