Question

I am trying to get the UTF-8 bytes (as decimal values) of a Unicode string. For example:

function unicode_to_utf8_bytes($string) {

}

$text = 'Hello 😀';
$result = unicode_to_utf8_bytes($text);

var_dump($result);

array(10) {
  [0]=>
  int(72)
  [1]=>
  int(101)
  [2]=>
  int(108)
  [3]=>
  int(108)
  [4]=>
  int(111)
  [5]=>
  int(32)
  [6]=>
  int(240)
  [7]=>
  int(159)
  [8]=>
  int(152)
  [9]=>
  int(128)
}

An example of the expected result can be seen here:

http://apps.timwhitlock.info/unicode/inspect?s=Hello+%F0%9F%98%80

I think I am close. This is what I have managed so far:

function utf8_char_code_at($str, $index) {

    $char = mb_substr($str, $index, 1, 'UTF-8');

    if (mb_check_encoding($char, 'UTF-8')) {
        $ret = mb_convert_encoding($char, 'UTF-32BE', 'UTF-8');
        return hexdec(bin2hex($ret));
    }
    else
        return null;

}

function unicode_to_utf8_bytes($str) { 

    $result = array();

    for ($i=0; $i<mb_strlen($str, '8bit'); $i++)
        $result[] = utf8_char_code_at($str, $i);

    return $result;

}

$string = 'Hello 😀';

var_dump(unicode_to_utf8_bytes($string));

array(10) {
  [0]=>
  int(72)
  [1]=>
  int(101)
  [2]=>
  int(108)
  [3]=>
  int(108)
  [4]=>
  int(111)
  [5]=>
  int(32)
  [6]=>
  int(128512)
  [7]=>
  int(0)
  [8]=>
  int(0)
  [9]=>
  int(0)
}

Any help is much appreciated!

Answer 1

This gives the result you are looking for:

$string = 'Hello 😀';
var_export(ascii_to_dec($string));

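function ascii_to_dec($str)
{
  // Initialise so that an empty string returns an empty array, not null
  $dec_array = array();
  // strlen() counts bytes, not characters, so this visits every UTF-8 byte
  for ($i = 0, $j = strlen($str); $i < $j; $i++) {
    // $str[$i] replaces $str{$i}, which was removed in PHP 8.0
    $dec_array[] = ord($str[$i]);
  }
  return $dec_array;
}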

Result:

array (
  0 => 72,
  1 => 101,
  2 => 108,
  3 => 108,
  4 => 111,
  5 => 32,
  6 => 240,
  7 => 159,
  8 => 152,
  9 => 128,
)
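For context, the attempt in the question seems to go wrong because mb_substr() indexes by character while the loop runs over the byte length, and the UTF-32BE conversion yields the code point (128512 is U+1F600) rather than the UTF-8 bytes. The remaining indexes point past the end of the string, so they come back as 0. As a possible alternative to the loop above, here is a minimal sketch using unpack(); the function name utf8_bytes is only illustrative, and it assumes the input string is already UTF-8 encoded and that PHP 7.0 or later is available for the "\u{...}" escape.

```php
<?php
// Sketch: PHP strings are byte sequences, so unpack() with the 'C*'
// format (unsigned char, repeated) yields one integer per byte.
// utf8_bytes is a hypothetical helper name, not part of PHP.
function utf8_bytes(string $str): array
{
    // Guard the empty string so the function always returns an array.
    if ($str === '') {
        return [];
    }
    // unpack() returns a 1-indexed array; array_values() re-indexes from 0.
    return array_values(unpack('C*', $str));
}

// "\u{1F600}" is the grinning-face emoji, written as an escape so the
// example does not depend on the encoding of the source file.
$bytes = utf8_bytes("Hello \u{1F600}");
echo implode(' ', $bytes), PHP_EOL; // 72 101 108 108 111 32 240 159 152 128
```

Both approaches read the raw bytes, so they only return UTF-8 values when the string itself is UTF-8, for instance a literal in a source file saved as UTF-8.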

Source

PHP Unicode to UTF-8 code
