Truncating content before the search keyword in a text

Asked: 2011-10-31 09:59:14

Tags: php preg-replace truncate str-replace

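I'm using the code below to truncate my content before and after the first occurrence of the search keyword in my text (this is for my search page). Everything works, except that the code cuts a word in half at the start of the excerpt, whereas it does not cut words at the end of the excerpt.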

Example:

lients at the centre of the relationship and to offer a first class service to them, which includes tax planning, investment management and estate planning. We believe that our customer focused and...

(Edit: sometimes more than one character is missing from the word.)

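You can see that it has cut the 'c' off 'clients'. It only happens at the beginning of the text, not at the end. How can I fix this? I believe I'm halfway there. The code so far: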

function neatest_trim($content, $chars, $searchquery, $characters_before, $characters_after) {
    if (strlen($content) > $chars) {
        $pos = strpos($content, $searchquery);
        $start = $characters_before < $pos ? $pos - $characters_before : 0;
        $len = $pos + strlen($searchquery) + $characters_after - $start;
        $content = str_replace('&nbsp;', ' ', $content);
        $content = str_replace("\n", '', $content);
        $content = strip_tags(trim($content));
        $content = preg_replace('/\s+?(\S+)?$/', '', mb_substr($content, $start, $len));
        $content = trim($content) . '...';
        $content = strip_tags($content);
        $content = str_ireplace($searchquery, '<span class="highlight" style="background: #E6E6E6;">' . $searchquery . '</span>', $content);
    }
    return $content;
}



$results[] = array(
    'text' => neatest_trim($row->content, 200, $searchquery, 120, 80)
);

2 Answers:

Answer 0 (score: 0)

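The 120 characters you keep before the keyword are taken without checking whether the character at that position is a space or a letter; the string is cut there regardless.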

I would make this change, which searches for the next space at or after the position where we start.

$start = $characters_before < $pos ? $pos - $characters_before : 0;
// add this line:
$start = strpos($content, ' ', $start);
$len = $pos + strlen($searchquery) + $characters_after - $start;

This way $start is the position of a space rather than of a letter inside a word.
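Two edge cases may be worth guarding against here: when $start is 0 the excerpt already begins at the start of the text, so searching for a space would drop the first word, and strpos returns false when no space follows the offset. A minimal sketch of a guard, assuming single-byte (ASCII) text; `snap_to_word_start` is a hypothetical helper name, not part of the original code:

```php
<?php
// Hypothetical helper: move $start forward to the beginning of the next whole word.
// Works on byte offsets, so it assumes single-byte text.
function snap_to_word_start(string $content, int $start): int
{
    // Clamp the offset into the valid range of the string.
    $start = max(0, min($start, strlen($content)));
    // The very start or very end of the text cannot be in the middle of a word cut.
    if ($start === 0 || $start === strlen($content)) {
        return $start;
    }
    // If the previous character is a space, $start is already a word start.
    if ($content[$start - 1] === ' ') {
        return $start;
    }
    // Otherwise skip the rest of the partial word.
    $space = strpos($content, ' ', $start);
    // No space found: keep the original offset rather than returning false.
    return $space === false ? $start : $space + 1;
}
```

With this sketch, the line added above could be written as `$start = snap_to_word_start($content, $start);`.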

Your function would become:

function neatest_trim($content, $chars, $searchquery, $characters_before, $characters_after) {
    if (strlen($content) > $chars) {
        // Position of the first occurrence of the search keyword.
        $pos = strpos($content, $searchquery);
        // Go back $characters_before characters, but not past the start of the text.
        $start = $characters_before < $pos ? $pos - $characters_before : 0;
        // Move $start forward to the next space so the excerpt does not begin mid-word.
        $start = strpos($content, " ", $start);
        $len = $pos + strlen($searchquery) + $characters_after - $start;
        $content = str_replace('&nbsp;', ' ', $content);
        $content = str_replace("\n", '', $content);
        $content = strip_tags(trim($content));
        // Cut the excerpt and drop the trailing partial word.
        $content = preg_replace('/\s+?(\S+)?$/', '', mb_substr($content, $start, $len));
        $content = trim($content) . '...';
        $content = strip_tags($content);
        // Highlight the keyword (case-insensitive).
        $content = str_ireplace($searchquery, '<span class="highlight" style="background: #E6E6E6;">' . $searchquery . '</span>', $content);
    }
    return $content;
}
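Note that the function above computes $pos and $start on the raw content and only afterwards strips `&nbsp;`, newlines and tags, so the offsets can drift when the content contains markup. It also mixes byte-based strpos/strlen with character-based mb_substr. The following is only a sketch of one way to keep these consistent, not the answerer's method: it cleans the text first and uses mb_* functions throughout. The name `neatest_trim_sketch` is hypothetical, highlighting is left out for brevity, and the mbstring extension is assumed to be available (the original code already calls mb_substr).

```php
<?php
// Sketch only: neatest_trim_sketch is a hypothetical name, not the original function.
function neatest_trim_sketch(string $content, int $chars, string $query, int $before, int $after): string
{
    // Clean first, so every offset refers to the text that is actually cut.
    $content = str_replace('&nbsp;', ' ', $content);
    $content = trim(strip_tags($content));
    // Assumption: collapsing whitespace runs to one space is acceptable here.
    $content = preg_replace('/\s+/', ' ', $content);

    if (mb_strlen($content) <= $chars) {
        return $content;
    }

    // Case-insensitive, character-based search, consistent with mb_substr.
    $pos = mb_stripos($content, $query);
    if ($pos === false) {
        $pos = 0; // keyword not found: excerpt from the start of the text
    }

    $start = max(0, $pos - $before);
    // Only adjust when the cut falls inside a word.
    if ($start > 0 && mb_substr($content, $start - 1, 1) !== ' ') {
        $space = mb_strpos($content, ' ', $start);
        if ($space !== false && $space < $pos) {
            $start = $space + 1; // first character of the next whole word
        }
    }

    $len = $pos + mb_strlen($query) + $after - $start;
    $excerpt = mb_substr($content, $start, $len);

    // Drop a trailing partial word only if text was actually cut at the end.
    if ($start + $len < mb_strlen($content)) {
        $excerpt = preg_replace('/\s+\S*$/', '', $excerpt);
    }
    return trim($excerpt) . '...';
}
```

It would be called the same way as the original, for example `neatest_trim_sketch($row->content, 200, $searchquery, 120, 80)`.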

Answer 1 (score: 0)

Why not use a regex replacement?

$result = preg_replace('/.*(.{10}\bword\b.{10}).*/s', '$1', $subject);

So this will trim away everything except the 10 characters before and the 10 characters after the keyword 'word'.

Explanation:

# .*(.{10}\bword\b.{10}).*
# 
# Options: dot matches newline
# 
# Match any single character «.*»
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Match the regular expression below and capture its match into backreference number 1 «(.{10}\bword\b.{10})»
#    Match any single character «.{10}»
#       Exactly 10 times «{10}»
#    Assert position at a word boundary «\b»
#    Match the characters “word” literally «word»
#    Assert position at a word boundary «\b»
#    Match any single character «.{10}»
#       Exactly 10 times «{10}»
# Match any single character «.*»
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

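So what this regex does is find the word you specify (and only that word, because it is wrapped in \b word boundaries), and it also finds and stores, together with the word, the 10 characters before it and the 10 characters after it. You can build the regex yourself using variables for the number of characters before and after, as well as for the keyword. The regex also matches everything else, but the replacement uses only the backreference $1, which is the output you want.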
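Building the regex from variables could look roughly like the sketch below. The function name `regex_excerpt` and its parameter names are hypothetical. Two details differ from the pattern above and are deliberate assumptions: `.{0,N}` is used instead of `.{N}`, because an exact count fails to match when the keyword sits closer than N characters to the start or end of the text (preg_replace then returns the subject unchanged), and the leading `.*` is made lazy so that the first occurrence is preferred rather than the last. The `u` flag assumes UTF-8 input, and the `\b` anchors assume the keyword begins and ends with a word character.

```php
<?php
// Hypothetical helper: builds the excerpt regex from variables.
function regex_excerpt(string $subject, string $keyword, int $before, int $after): string
{
    // preg_quote escapes regex metacharacters in the keyword, including the "/" delimiter.
    $word = preg_quote($keyword, '/');
    // s: dot matches newline; u: treat pattern and subject as UTF-8.
    $pattern = '/.*?(.{0,' . $before . '}\b' . $word . '\b.{0,' . $after . '}).*/su';
    // preg_replace returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($pattern, '$1', $subject, 1) ?? $subject;
}
```

For example, `regex_excerpt($row->content, $searchquery, 120, 80)` would keep up to 120 characters before and 80 after the keyword. Like the original pattern, this counts characters and so can still begin or end in the middle of a word.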