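I have a Chinese character stored in a variable, encoded as UTF-8:
$a = '列';
Given this, how can I assign the value '5217' to a string ($b)? (Possibly by going through UTF-16? But there may be a better way.)
Code point: http://www.fileformat.info/info/unicode/char/5217/index.htm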
Answer 0: (score: 0)
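function unicode_decode($str) {
    // Match runs of three bytes outside printable ASCII, i.e. 3-byte UTF-8 sequences such as CJK characters
    return preg_replace_callback("/((?:[^\x09\x0A\x0D\x20-\x7E]{3})+)/", "decode_callback", $str);
}
function decode_callback($matches) {
    // UTF-16BE: big-endian with no BOM, so each pair of bytes is one 16-bit code unit
    $char = mb_convert_encoding($matches[1], "UTF-16BE", "UTF-8");
    $escaped = "";
    for ($i = 0, $l = strlen($char); $i < $l; $i += 2) {
        // Emit each code unit as \uXXXX (backslash escaped so it stays literal)
        $escaped .= "\\u" . sprintf("%02x%02x", ord($char[$i]), ord($char[$i + 1]));
    }
    return $escaped;
}
$a = '列';
var_dump(unicode_decode($a)); // string(6) "\u5217"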
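If only the bare hex digits are wanted rather than a `\u`-escaped string, the same UTF-16 idea can be reduced to a single conversion followed by `bin2hex`. This is a minimal sketch, assuming the mbstring extension is loaded; the helper name `utf16_hex` is made up for this illustration and is not part of the answer above.

```php
<?php
// Hypothetical helper: hex dump of the UTF-16BE encoding of a UTF-8 string.
function utf16_hex(string $str): string {
    // UTF-16BE is big-endian with no byte order mark,
    // so a BMP character becomes exactly two bytes.
    return bin2hex(mb_convert_encoding($str, 'UTF-16BE', 'UTF-8'));
}

$a = '列';
$b = utf16_hex($a);
var_export($b); // '5217'
```

For characters outside the Basic Multilingual Plane this yields the surrogate pair, not the code point, so it only matches the code point for characters like the one in the question.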
Answer 1: (score: 0)
You can simply parse the UTF-8 yourself:
function utf8ord($c) {
    // Decode the first UTF-8 sequence in $c into its Unicode code point
    $ord0 = ord($c[0]);
    if ($ord0 < 0x80) return $ord0; // 1 byte (ASCII)
    if ($ord0 < 0xe0) return ($ord0 & 0x1f) << 6 | (ord($c[1]) & 0x3f); // 2 bytes
    if ($ord0 < 0xf0) return ($ord0 & 0x0f) << 12 | (ord($c[1]) & 0x3f) << 6 | (ord($c[2]) & 0x3f); // 3 bytes
    return ($ord0 & 0x07) << 18 | (ord($c[1]) & 0x3f) << 12 | (ord($c[2]) & 0x3f) << 6 | (ord($c[3]) & 0x3f); // 4 bytes
}
$a = '列';
$b = dechex(utf8ord($a));
var_export($b); // outputs '5217'
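On newer PHP versions the manual bit arithmetic may not be needed. The sketch below assumes PHP 7.2 or later with the mbstring extension, where `mb_ord` returns the code point of the first character; the wrapper name `codepoint_hex` is hypothetical and only serves the example.

```php
<?php
// Hypothetical wrapper: hex code point of the first character of a UTF-8 string.
// Assumes PHP 7.2+ with mbstring, which provides mb_ord().
function codepoint_hex(string $char): string {
    $cp = mb_ord($char, 'UTF-8');
    if ($cp === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // Pad to at least four hex digits, the usual U+XXXX width.
    return sprintf('%04x', $cp);
}

$a = '列';
$b = codepoint_hex($a);
var_export($b); // '5217'
```

Unlike the UTF-16 route, this gives the actual code point for characters beyond U+FFFF as well.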