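Given that not all Unicode combining character sequences have an equivalent precomposed character (under NFC), is there a way to get the "rendered" length of a string in PHP, where that is possible and semantically meaningful?
http://3v4l.org/L1kPl (uses PHP 7 escape syntax)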
<?php
echo $s = "\u{0071}\u{0307}\u{0323}";
echo "\n";
echo mb_strlen(Normalizer::normalize($s, Normalizer::FORM_C), "UTF-8");
// Prints 3 because this sequence has no precomposed equivalent,
// so NFC still leaves three code points. I want to get 1 instead.
What I have achieved so far: http://3v4l.org/4NSCi
<?php
echo $s = "\u{0071}\u{0307}\u{0323}";
echo "\n";
$r = Normalizer::normalize($s, Normalizer::FORM_C);
// Strip non-spacing marks (\p{Mn}) before counting; prints 1 for this input.
echo mb_strlen(preg_replace("@\p{Mn}@u", "", $r), "UTF-8");
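Stripping `\p{Mn}` only covers non-spacing marks, so it may miscount text that uses spacing or enclosing marks. A possible alternative, sketched here on the assumption that "rendered length" means the number of extended grapheme clusters (not display columns), is to count clusters directly. It relies on the intl extension, which `Normalizer` already requires, and on PCRE's `\X` escape; the variable names are only illustrative.

```php
<?php
// Assumption: the intl extension is loaded (Normalizer needs it as well).
$s = "\u{0071}\u{0307}\u{0323}"; // q + COMBINING DOT ABOVE + COMBINING DOT BELOW

// Code point count, for comparison with the snippets above.
$codePoints = mb_strlen($s, "UTF-8");

// grapheme_strlen() counts grapheme clusters rather than code points.
$clusters = grapheme_strlen($s);

// PCRE alternative: \X matches one extended grapheme cluster in /u mode,
// and preg_match_all() returns the number of matches.
$clustersPcre = preg_match_all('/\X/u', $s);

echo $codePoints, " ", $clusters, " ", $clustersPcre, "\n";
```

Neither call needs the string to be normalized first, since cluster boundaries do not depend on whether a precomposed form exists.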