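I want to truncate a string to a given number of characters. The string contains HTML entities. Note that I have already stripped all HTML tags from the string. If the cut point falls on a special character, the cut should not land in the middle of the HTML entity, but before or after it. These examples do not work: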
//example 1
$str = "French for French is fran&ccedil;ais";
$str = substr($str, 0, 27);
//$str contains "French for French is fran&c";
//example 2
$str = "the en dash looks like &#8211;";
$str = substr($str, 0, 25);
//$str contains "the en dash looks like &#";
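So I thought I should first convert the entities to single characters, truncate, and then convert the single characters back to entities. That seems to work for the first example, but not for the second.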
//example 1
$str = "French for French is fran&ccedil;ais";
$str = html_entity_decode($str);
$str = substr($str, 0, 27);
$str = htmlentities($str);
//$str contains "French for French is fran&ccedil;a";
//example 2
$str = "the en dash looks like &#8211;";
$str = html_entity_decode($str);
$str = substr($str, 0, 25);
$str = htmlentities($str);
//$str contains "the en dash looks like &#";
What should I change so that both examples behave the way I expect?
Answer 0: (score: 2)
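By default, htmlentities encodes using the default_charset value from your php.ini. If that character set does not support the entities you are trying to convert, it may not behave as expected. Also, substr counts bytes rather than characters, so a multibyte character can be cut in half; mb_substr counts characters. Try passing the charset explicitly and see if you get different results.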
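// Decode entities to UTF-8 characters, truncate by character, then re-encode.
// ENT_QUOTES replaces the null flags argument, which newer PHP versions deprecate.
$str = html_entity_decode($str, ENT_QUOTES, 'UTF-8');
$str = mb_substr($str, 0, 25, 'UTF-8');
$str = htmlentities($str, ENT_QUOTES, 'UTF-8');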
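The three calls above can be combined into one helper. This is only a minimal sketch: truncate_entities is a hypothetical name and not part of PHP, it assumes the mbstring extension is loaded and that the input is UTF-8, and ENT_QUOTES is my choice of flag, not something the question specifies.

```php
<?php
// Hypothetical helper (not a PHP built-in): truncate text that contains
// HTML entities without cutting an entity in half.
function truncate_entities(string $str, int $limit): string
{
    // Turn entities into real UTF-8 characters, e.g. &ccedil; becomes ç.
    $decoded = html_entity_decode($str, ENT_QUOTES, 'UTF-8');

    // Count characters, not bytes, so a multibyte character is never split.
    $cut = mb_substr($decoded, 0, $limit, 'UTF-8');

    // Re-encode special characters as entities for output.
    return htmlentities($cut, ENT_QUOTES, 'UTF-8');
}

echo truncate_entities('French for French is fran&ccedil;ais', 27), "\n";
// French for French is fran&ccedil;a

echo truncate_entities('the en dash looks like &#8211;', 24), "\n";
// the en dash looks like &ndash;
```

Note that the limit applies to the decoded characters, so the re-encoded string can be longer than the limit in bytes, and a numeric entity such as &#8211; may come back as its named form (&ndash;).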