I saved the record "فحص الرسالة العربية" from PHP, and it is always stored as:
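&#1601;&#1581;&#1589; &#1575;&#1604;&#1585;&#1587;&#1575;&#1604;&#1577; &#1575;&#1604;&#1593;&#1585;&#1576;&#1610;&#1577;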
When I retrieve it, I want to convert it to UTF-16BE and print the bytes as hex, so I use a function that returns:
002600230031003600300031003b002600230031003500380031003b002600230031003500380039003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500380035003b002600230031003500380037003b002600230031003500370035003b002600230031003600300034003b002600230031003500370037003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500390033003b002600230031003500380035003b002600230031003500370036003b002600230031003600310030003b002600230031003500370037003b
This is the function I use to convert the string retrieved from the database:
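function convertCharsn($string) {
    $in = '';
    // Re-encode the UTF-8 input as UTF-16BE (two bytes per BMP character).
    $out = iconv('UTF-8', 'UTF-16BE', $string);
    // Append each byte as two uppercase hex digits.
    for ($i = 0; $i < strlen($out); $i++) {
        $in .= sprintf("%02X", ord($out[$i]));
    }
    return $in;
}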
But when I type the same characters into the converter at the URL below, the result is different from what my function produces. http://www.routesms.com/downloads/onlineunicode.asp
It returns:
0641062D063500200627064406310633062706440629002006270644063906310628064A0629
I want my string to be converted the same way the URL above converts it. My database collation is utf-8_general_ci
Answer 0: (score: 2)
Basically, you need to decode those characters from HTML entities first. Just use html_entity_decode():
$rawChars = html_entity_decode($string, ENT_QUOTES | ENT_HTML401, 'UTF-8');
echo convertCharsn($rawChars);
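Otherwise, you're just encoding the entities themselves. You can see this because & is 0026 in UTF-16, # is 0023, and so &# is 00260023. That is why the sequence 00260023 repeats throughout the output you posted above. So decode the string first, and you should be set...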
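To make the difference concrete, here is a minimal sketch that hex-encodes the first word of the stored string both with and without decoding. The helper name utf16beHex() is made up for this illustration (it does the same job as convertCharsn() from the question), and the sketch assumes the iconv extension is enabled.

```php
<?php
// Hypothetical helper: same job as convertCharsn() in the question.
function utf16beHex(string $utf8): string
{
    // iconv() returns false when the input is not valid UTF-8.
    $utf16 = iconv('UTF-8', 'UTF-16BE', $utf8);
    if ($utf16 === false) {
        throw new RuntimeException('Input is not valid UTF-8');
    }
    // bin2hex() emits lowercase digits; uppercase them to match the target format.
    return strtoupper(bin2hex($utf16));
}

// First word of the record, as it is stored in the database.
$stored = '&#1601;&#1581;&#1589;';

// Without decoding, the entity text itself is encoded: output starts with 00260023.
echo utf16beHex($stored), "\n";

// With decoding, the actual Arabic letters are encoded.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML401, 'UTF-8');
echo utf16beHex($decoded), "\n"; // 0641062D0635
```

The second line of output matches the first six bytes of the result from the online converter. If you control the code that writes to the database, it may be worth checking why the text is being entity-encoded before it is stored, since storing the raw UTF-8 would avoid the decode step altogether.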