Why doesn't PHP's htmlentities() convert the œ character?

Date: 2012-12-07 15:41:54

Tags: php html-entities html-encode

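I'm having some trouble with PHP's htmlentities() / htmlspecialchars() functions. The string I'm converting contains the character œ (HTML equivalent: '&oelig;'), but neither htmlentities() nor htmlspecialchars() converts this character.

When I run get_html_translation_table(HTML_ENTITIES) to see the translation table PHP is using, I notice that the œ character is missing, while other ligatures such as æ (&aelig;) are present. Why is this? Should I be converting the œ character in a different way?

For reference, I'm running PHP 5.3.14, and this is the output of get_html_translation_table(HTML_ENTITIES):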

array(100) {
  [" "]=>
  string(6) "&nbsp;"
  ["¡"]=>
  string(7) "&iexcl;"
  ["¢"]=>
  string(6) "&cent;"
  ["£"]=>
  string(7) "&pound;"
  ["¤"]=>
  string(8) "&curren;"
  ["¥"]=>
  string(5) "&yen;"
  ["¦"]=>
  string(8) "&brvbar;"
  ["§"]=>
  string(6) "&sect;"
  ["¨"]=>
  string(5) "&uml;"
  ["©"]=>
  string(6) "&copy;"
  ["ª"]=>
  string(6) "&ordf;"
  ["«"]=>
  string(7) "&laquo;"
  ["¬"]=>
  string(5) "&not;"
  ["­"]=>
  string(5) "&shy;"
  ["®"]=>
  string(5) "&reg;"
  ["¯"]=>
  string(6) "&macr;"
  ["°"]=>
  string(5) "&deg;"
  ["±"]=>
  string(8) "&plusmn;"
  ["²"]=>
  string(6) "&sup2;"
  ["³"]=>
  string(6) "&sup3;"
  ["´"]=>
  string(7) "&acute;"
  ["µ"]=>
  string(7) "&micro;"
  ["¶"]=>
  string(6) "&para;"
  ["·"]=>
  string(8) "&middot;"
  ["¸"]=>
  string(7) "&cedil;"
  ["¹"]=>
  string(6) "&sup1;"
  ["º"]=>
  string(6) "&ordm;"
  ["»"]=>
  string(7) "&raquo;"
  ["¼"]=>
  string(8) "&frac14;"
  ["½"]=>
  string(8) "&frac12;"
  ["¾"]=>
  string(8) "&frac34;"
  ["¿"]=>
  string(8) "&iquest;"
  ["À"]=>
  string(8) "&Agrave;"
  ["Á"]=>
  string(8) "&Aacute;"
  ["Â"]=>
  string(7) "&Acirc;"
  ["Ã"]=>
  string(8) "&Atilde;"
  ["Ä"]=>
  string(6) "&Auml;"
  ["Å"]=>
  string(7) "&Aring;"
  ["Æ"]=>
  string(7) "&AElig;"
  ["Ç"]=>
  string(8) "&Ccedil;"
  ["È"]=>
  string(8) "&Egrave;"
  ["É"]=>
  string(8) "&Eacute;"
  ["Ê"]=>
  string(7) "&Ecirc;"
  ["Ë"]=>
  string(6) "&Euml;"
  ["Ì"]=>
  string(8) "&Igrave;"
  ["Í"]=>
  string(8) "&Iacute;"
  ["Î"]=>
  string(7) "&Icirc;"
  ["Ï"]=>
  string(6) "&Iuml;"
  ["Ð"]=>
  string(5) "&ETH;"
  ["Ñ"]=>
  string(8) "&Ntilde;"
  ["Ò"]=>
  string(8) "&Ograve;"
  ["Ó"]=>
  string(8) "&Oacute;"
  ["Ô"]=>
  string(7) "&Ocirc;"
  ["Õ"]=>
  string(8) "&Otilde;"
  ["Ö"]=>
  string(6) "&Ouml;"
  ["×"]=>
  string(7) "&times;"
  ["Ø"]=>
  string(8) "&Oslash;"
  ["Ù"]=>
  string(8) "&Ugrave;"
  ["Ú"]=>
  string(8) "&Uacute;"
  ["Û"]=>
  string(7) "&Ucirc;"
  ["Ü"]=>
  string(6) "&Uuml;"
  ["Ý"]=>
  string(8) "&Yacute;"
  ["Þ"]=>
  string(7) "&THORN;"
  ["ß"]=>
  string(7) "&szlig;"
  ["à"]=>
  string(8) "&agrave;"
  ["á"]=>
  string(8) "&aacute;"
  ["â"]=>
  string(7) "&acirc;"
  ["ã"]=>
  string(8) "&atilde;"
  ["ä"]=>
  string(6) "&auml;"
  ["å"]=>
  string(7) "&aring;"
  ["æ"]=>
  string(7) "&aelig;"
  ["ç"]=>
  string(8) "&ccedil;"
  ["è"]=>
  string(8) "&egrave;"
  ["é"]=>
  string(8) "&eacute;"
  ["ê"]=>
  string(7) "&ecirc;"
  ["ë"]=>
  string(6) "&euml;"
  ["ì"]=>
  string(8) "&igrave;"
  ["í"]=>
  string(8) "&iacute;"
  ["î"]=>
  string(7) "&icirc;"
  ["ï"]=>
  string(6) "&iuml;"
  ["ð"]=>
  string(5) "&eth;"
  ["ñ"]=>
  string(8) "&ntilde;"
  ["ò"]=>
  string(8) "&ograve;"
  ["ó"]=>
  string(8) "&oacute;"
  ["ô"]=>
  string(7) "&ocirc;"
  ["õ"]=>
  string(8) "&otilde;"
  ["ö"]=>
  string(6) "&ouml;"
  ["÷"]=>
  string(8) "&divide;"
  ["ø"]=>
  string(8) "&oslash;"
  ["ù"]=>
  string(8) "&ugrave;"
  ["ú"]=>
  string(8) "&uacute;"
  ["û"]=>
  string(7) "&ucirc;"
  ["ü"]=>
  string(6) "&uuml;"
  ["ý"]=>
  string(8) "&yacute;"
  ["þ"]=>
  string(7) "&thorn;"
  ["ÿ"]=>
  string(6) "&yuml;"
  ["&"]=>
  string(5) "&amp;"
  ["""]=>
  string(6) "&quot;"
  ["<"]=>
  string(4) "&lt;"
  [">"]=>
  string(4) "&gt;"
}
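
The 100 entries above are the Latin-1 range plus the four special characters, which suggests the table was built for ISO-8859-1. One way to check whether the table depends on the encoding is to build it twice and look for œ in each. This is only a sketch: it assumes PHP 5.3.4 or later (where get_html_translation_table() accepts a third encoding argument), and the variable names are made up for illustration.

```php
<?php
// "\xC5\x93" is the UTF-8 byte sequence for œ (U+0153), written as bytes
// so the result does not depend on how this file is saved.
$oe = "\xC5\x93";

// Build the entity table once per encoding (hypothetical variable names).
$latin1Table = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'ISO-8859-1');
$utf8Table   = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'UTF-8');

// ISO-8859-1 has no code point for œ, so its table cannot contain it.
var_dump(isset($latin1Table[$oe])); // bool(false)

// The UTF-8 table covers the full HTML 4.01 entity set (checked on PHP 5.4+).
var_dump(isset($utf8Table[$oe]));   // bool(true)
```

If the second check prints true, the missing entry comes from the encoding in effect rather than from the entity list itself.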

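1 Answer:

Answer 0 (score: 2)

I tried this with PHP 5.4.5 and it outputs &oelig; correctly, so I can't really test this, but my guess is that œ is not part of ISO-8859-1, the character set used by default. It belongs to a supplementary character set. Try using ISO-8859-15: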

htmlentities($s, ENT_COMPAT | ENT_HTML401, "ISO-8859-15");
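
A possible reason the 5.4.5 test worked: from PHP 5.4 on, htmlentities() defaults to UTF-8 rather than ISO-8859-1. The sketch below passes the encoding explicitly so the result does not depend on the PHP version. It assumes the input string's bytes really are in the encoding named, uses made-up sample strings, and leaves out ENT_HTML401 because that constant only exists from PHP 5.4.

```php
<?php
// Case 1: the input is UTF-8 ("\xC5\x93" is œ in UTF-8).
$utf8Input = "c\xC5\x93ur";
$fromUtf8  = htmlentities($utf8Input, ENT_COMPAT, 'UTF-8');
echo $fromUtf8, "\n"; // c&oelig;ur

// Case 2: the input is ISO-8859-15, where œ is the single byte 0xBD.
// The encoding argument must match the bytes actually stored in the string.
$latin9Input = "c\xBDur";
$fromLatin9  = htmlentities($latin9Input, ENT_COMPAT, 'ISO-8859-15');
echo $fromLatin9, "\n";
```

Passing "ISO-8859-15" for a string that is actually UTF-8 (or the reverse) will not produce the expected entity, so it is worth confirming the source encoding first.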