Question

I want to truncate a string that contains HTML tags while preserving its structure.

Example:

Hello! My name is Jerald and here is a link to my <a href="#">Blog</a>.

So if I set the maximum number of characters for the substring to 52, I want it to return a string like this:

Hello! My name is Jerald and here is a link to my <a href="#">Bl</a>

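I tried using the strip_tags and strlen functions to count the text with and without HTML tags, and then set the new substring length to the length of the text plus the length of the tags. But this approach broke the HTML string.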
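
For illustration, here is a minimal sketch of how that attempt might have looked and why it breaks. The helper name naiveTruncate is hypothetical, and the arithmetic is an assumption based on the description above, not the asker's actual code.

```php
<?php
// Hypothetical reconstruction of the approach described in the question;
// the asker's actual code is not shown, so this is only an assumption.
function naiveTruncate(string $html, int $length): string {
    $textLength   = strlen(strip_tags($html));    // bytes of visible text
    $markupLength = strlen($html) - $textLength;  // bytes taken up by all tags
    // Adding the size of ALL tags ignores where the tags actually sit,
    // so the cut can land in the middle of a tag.
    return substr($html, 0, $length + $markupLength);
}

$input  = 'Hello! My name is Jerald and here is a link to my <a href="#">Blog</a>.';
$result = naiveTruncate($input, 52);
echo $result, "\n";
```

With this input the cut falls inside the closing tag, so the result ends in an unterminated `</`, which is the kind of broken markup described above.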

Answer 1

I use this. It is not my own code, and I can't find where I originally got it:

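function truncateHTML($html_string, $length, $append = '&hellip;', $is_html = true) {
  $html_string = trim($html_string);
  // Append the suffix only when the visible text is longer than the limit.
  $append = (strlen(strip_tags($html_string)) > $length) ? $append : '';
  $i = 0;      // bytes of markup seen so far, which do not count toward $length
  $tags = [];  // stack of tags that are still open

  if ($is_html) {
    // Match every tag together with the text that follows it, recording byte offsets.
    preg_match_all('/<[^>]+>([^<]*)/', $html_string, $tag_matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

    foreach ($tag_matches as $tag_match) {
      // Stop once this tag starts at or beyond the visible-text limit.
      if ($tag_match[0][1] - $i >= $length) {
        break;
      }

      // Extract the tag name, e.g. "a" from '<a href="#">' or "/a" from '</a>'.
      $tag = substr(strtok($tag_match[0][0], " \t\n\r\0\x0B>"), 1);
      if ($tag[0] != '/') {
        $tags[] = $tag;
      }
      elseif (end($tags) == substr($tag, 1)) {
        array_pop($tags);
      }

      // The tag's length is the distance from its start to the start of the text after it.
      $i += $tag_match[1][1] - $tag_match[0][1];
    }
  }

  // Cut at the limit plus the markup seen so far, then close open tags, innermost first.
  $length = min(strlen($html_string), $length + $i);
  $closing = $tags ? '</' . implode('></', array_reverse($tags)) . '>' : '';

  return substr($html_string, 0, $length) . $closing . $append;
}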
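
The function relies on the byte offsets that preg_match_all reports when PREG_OFFSET_CAPTURE and PREG_SET_ORDER are combined. As a rough sketch of that bookkeeping (the sample string and variable names here are made up for illustration):

```php
<?php
$html = 'to my <a href="#">Blog</a>.';

// Each match is [[fullMatch, offset], [textAfterTag, offset]].
preg_match_all('/<[^>]+>([^<]*)/', $html, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

$tagLengths = [];
foreach ($matches as $m) {
    $tagStart  = $m[0][1];  // byte offset where the tag begins
    $textStart = $m[1][1];  // byte offset where the text after the tag begins
    $tagLengths[] = $textStart - $tagStart;
    printf("%s occupies %d bytes\n", substr($html, $tagStart, $textStart - $tagStart), $textStart - $tagStart);
}
```

Summing these lengths is what lets the function count only visible text against the limit. Note that the offsets are in bytes, so multibyte text may need the mb_* string functions instead.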

Example:

$my_input_with_html = 'Hello! My name is Jerald and here is a link to my <a href="#">Blog</a>.';
$my_output_with_correct_html = truncateHTML($my_input_with_html, 52);

Gives you:

Hello! My name is Jerald and here is a link to my <a href="#">Bl</a>&hellip;

Demo: https://eval.in/832674

Hope this helps.

Truncate a string while preserving HTML tag structure
