Question

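Let's say a user submits a comment. I want to get an array of the Unicode code points of its value, pick out the code points that are invalid, discard them, and then save the comment. How can I do this?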

For example:

The user submits "hello", and I want to get an array $codepoints with the following values:

$codepoints[0] = 0068
$codepoints[1] = 0065
$codepoints[2] = 006C
$codepoints[3] = 006C
$codepoints[4] = 006F

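And, for some odd reason, I don't want to allow the letter "l", so I want to discard every character whose code point is U+006C. The saved comment would then be "heo". Is this even possible?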

Thanks in advance!

Answer 1

Here is an example that uses a Unicode literal.

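// Tell mbstring to treat strings and regex patterns as UTF-8.
mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
// Remove every bullet character (U+2022) from the string.
echo mb_ereg_replace('[•]', '', '•T•e•s•t•');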

This outputs the string Test.

If you would rather write the code points in hexadecimal, this answer may be useful.
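
For the exact case in the question, here is a rough sketch of one way to build the code point array and drop unwanted code points. It assumes PHP 7.4 or later with the mbstring extension (for mb_str_split and mb_ord) and UTF-8 input. The function names codepoints and strip_codepoints are made up for illustration; they are not built-in functions and may not match what the linked answer shows.

```php
<?php
// Hypothetical helper: list the code points of a UTF-8 string
// as 4-digit uppercase hex strings, e.g. "h" becomes "0068".
function codepoints(string $text): array
{
    $result = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $result[] = sprintf('%04X', mb_ord($char, 'UTF-8'));
    }
    return $result;
}

// Hypothetical helper: drop every character whose code point
// (given as an integer) appears in $banned.
function strip_codepoints(string $text, array $banned): string
{
    $kept = '';
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        if (!in_array(mb_ord($char, 'UTF-8'), $banned, true)) {
            $kept .= $char;
        }
    }
    return $kept;
}

print_r(codepoints('hello'));                   // 0068, 0065, 006C, 006C, 006F
echo strip_codepoints('hello', [0x006C]), "\n"; // heo
```

If you only need the filtering and not the array, a regex with a hex escape and the u modifier should do the same job in one call: preg_replace('/\x{006C}/u', '', 'hello') gives heo.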