How to convert HTML codes to the corresponding Unicode characters

Time: 2013-04-05 10:21:48

Tags: php unicode

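I have googled a lot and searched this forum as well, but this is my second day on it and I still can't find a solution.

My problem is that I want to convert the HTML codes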

&#1576;&#1575;&#1582;

to their equivalent Unicode characters

خ ا ب

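I don't want to convert all HTML symbols to Unicode characters. I only want to convert the Arabic / Urdu HTML codes. These characters range from &#1563; to &#1785;. If there is no PHP function for this, how can I replace the codes with their equivalent Unicode characters in one go?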

3 Answers:

Answer 0: (score: 4)

I think you're looking for:

html_entity_decode('&#1576;&#1575;&#1582;', ENT_QUOTES, 'UTF-8');

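When you go from &#1576; to ب, that is called decoding. Going the other way is called encoding.

As for replacing only &#1563; through &#1785;, maybe try something like this.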
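To make the decode/encode distinction concrete, here is a minimal sketch that uses only built-in functions. The variable names `$decoded` and `$encoded` are mine, and I'm assuming the script file itself is saved as UTF-8.

```php
<?php
// Decoding: numeric character references become real UTF-8 characters.
$decoded = html_entity_decode('&#1576;&#1575;&#1582;', ENT_QUOTES, 'UTF-8');
echo $decoded, "\n"; // the letters U+0628, U+0627, U+062E

// Encoding goes the other way, but htmlentities() only emits named entities.
$encoded = htmlentities('é <b>', ENT_QUOTES, 'UTF-8');
echo $encoded, "\n"; // &eacute; &lt;b&gt;

// Arabic letters have no named entity, so htmlentities() leaves them untouched.
var_dump(htmlentities($decoded, ENT_QUOTES, 'UTF-8') === $decoded); // bool(true)
```

So `html_entity_decode()` on its own already handles the Arabic codes; the extra work only starts when everything outside that range has to stay encoded.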

<?php

// Random set of entities, two are outside the 1563 - 1785 range.
$entities = '&#1563;&#1564;&#60;&#1604;&#241;&#1784;&#1785;';

// Matches entities from 1500 to 1799, not perfect, I know.
preg_match_all('/&#1[5-7][0-9]{2};/', $entities, $matches);

$entityRegex = array(); // Will hold the entity code regular expression.
$decodedCharacters = array(); // Will hold the decoded characters.

foreach ($matches[0] as $entity)
{
    // Convert the entity to human-readable character.
    $unicodeCharacter = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

    array_push($entityRegex, "/$entity/");
    array_push($decodedCharacters, $unicodeCharacter);
}

// Replace all of the matched entities with the human-readable character.
$replaced = preg_replace($entityRegex, $decodedCharacters, $entities);

?>

That's as close as I could get to solving this. Hope it helps a bit. It's 5 a.m. here, so I'm off to bed! :)
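Since the regex above matches 1500 to 1799 rather than the exact range, one possible tightening is to capture the number and compare it in a callback. This is only a sketch: the function name `decodeEntityRange` and its default bounds are my own, taken from the 1563 to 1785 range in the question.

```php
<?php
// Hypothetical helper: decode only decimal references whose code point
// falls inside [$min, $max]; every other reference is returned unchanged.
function decodeEntityRange(string $html, int $min = 1563, int $max = 1785): string
{
    $result = preg_replace_callback(
        '/&#(\d+);/',
        function (array $m) use ($min, $max): string {
            $code = (int) $m[1];
            if ($code < $min || $code > $max) {
                return $m[0]; // outside the range: keep the entity as written
            }
            return html_entity_decode($m[0], ENT_QUOTES, 'UTF-8');
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$entities = '&#1563;&#1564;&#60;&#1604;&#241;&#1784;&#1785;';
echo decodeEntityRange($entities), "\n";
```

With the sample string, `&#60;` and `&#241;` should come out still encoded, while the references inside the range are replaced by their characters. Hexadecimal references such as `&#x628;` are not handled by this pattern.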

Answer 1: (score: 0)

Have you tried setting the UTF-8 encoding in the HTML head?

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

Answer 2: (score: 0)

Try this

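<?php
// Build a reverse table: entity => UTF-8 character.
// Requesting 'UTF-8' explicitly makes utf8_encode() unnecessary.
$trans_tbl = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
$ttr = array();
foreach ($trans_tbl as $k => $v)
{
    $ttr[$v] = $k;
}
// Note: the table holds no decimal references such as &#1576;,
// so strtr() passes those through unchanged.
$text = '&#1576;&#1576;....;&#1582';
$text = strtr($text, $ttr);
echo $text;
?>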
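If you want to keep the `strtr()` idea but have it cover the numeric codes from the question, one option is to build the table by hand for just that range. A rough sketch, where `buildNumericEntityTable` is a name I made up and 1563 to 1785 are the bounds given in the question:

```php
<?php
// Hypothetical helper: map "&#N;" => UTF-8 character for every N in the range.
function buildNumericEntityTable(int $first, int $last): array
{
    $table = [];
    for ($cp = $first; $cp <= $last; $cp++) {
        $entity = "&#{$cp};";
        // A reference PHP does not decode maps to itself, which is harmless.
        $table[$entity] = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');
    }
    return $table;
}

$table = buildNumericEntityTable(1563, 1785);

// &#60; is outside the range, so it stays encoded.
echo strtr('&#1576;&#1575;&#1582; &#60;', $table), "\n";
```

The table only needs to be built once and can then be reused for every string, which is the main attraction of `strtr()` over a regex here.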

For a MySQL solution, you can set the character set like this:

$mysqli = new mysqli($host, $user, $pass, $db);

if (!$mysqli->set_charset("utf8")) {
    die("error");
}