PHP: extract the parts of a string before and after a keyword, then replace everything except the keywords

Date: 2014-11-09 05:43:40

Tags: php regex string search replace

I need to process a string:

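$str = "Lorem ipsum dolor before Before MY KEYWORD 1 After after, sed do eiusmod tempor incididunt ut before Before MY KEYWORD 2 After after et dolore magna aliqua. Ut enim ad minim veniam, quis before Before MY KEYWORD 1 After after exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat";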

$arr_keywords = array("MY KEYWORD 1", "MY KEYWORD 2");

  1. Extract the part of the string before and after each keyword, giving an array of keywords with their surrounding words:

    array(
            before Before MY KEYWORD 1 After after,
            before Before MY KEYWORD 1 After after,
            before Before MY KEYWORD 2 After after
    )
    
  2. Then replace everything in the string (e => U) except the keywords and their surrounding words.

Result:

"LorUm ipsum dolor before Before MY KEYWORD 1 After after, sUd do Uiusmod tUmpor incididunt ut before Before MY KEYWORD 2 After after Ut dolorU magna aliqua. Ut Unim ad minim vUniam, quis before Before MY KEYWORD 1 After after UxUrcitation ullamco laboris nisi ut aliquip Ux Ua commodo consUquat"

Any suggestions on how to do this?

Thanks!

2 answers:

Answer 0: (score: 0)

See how this works for you:

$str = 'Lorem ipsum dolor before Before MY KEYWORD 1 After after, sed do eiusmod tempor incididunt ut before Before MY KEYWORD 2 After after et dolore magna aliqua. Ut enim ad minim veniam, quis before Before MY KEYWORD 1 After after exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat';
$arr_keywords = array("MY KEYWORD 1", "MY KEYWORD 2");

$currpos = 0;
$newstr = '';
$kwds_plus_surround = array();
$len = strlen($str);
while ($currpos < $len) {
    // Search for the earliest match of any of the keywords from our current position.
    list($newpos, $kw_index) = strpos_arr($str, $arr_keywords, $currpos);

    if ($newpos == -1) {
        // We're beyond the last keyword - do replacement to the end and
        // add to the output.
        $newstr .= do_replace(substr($str, $currpos));
        $currpos = $len + 1;
    } else {
        // Found a keyword.
        // Now look two words back (separating words on single spaces).
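        $secondspc_back = $newpos - 1;
        for ($i = 2; $i > 0; $i--) {
            // A negative offset makes strrpos() search backward from that far
            // before the end of the string. Stop if it would point before the
            // start (PHP 8 throws a ValueError, earlier versions warn).
            $offset = $secondspc_back - $len - 1;
            if (-$offset > $len) { $secondspc_back = false; break; }
            $secondspc_back = strrpos($str, ' ', $offset);
            if ($secondspc_back === false) break;
        }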
        if ($secondspc_back === false || $secondspc_back < $currpos) {
            $secondspc_back = $currpos;
        } else  $secondspc_back++;

        // Do replacement on the stuff between the previous keyword
        // (plus 2 words after) and this one (minus two words before),
        // and add to the output.
        $in_between = substr($str, $currpos, $secondspc_back - $currpos);
        $newstr .= do_replace($in_between);

        // Now look two words forward (separating words on single spaces).
        $secondspc_fwd = $newpos + strlen($arr_keywords[$kw_index]);
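        for ($i = 2; $i > 0; $i--) {
            // Stop if the next search would start past the end of the string
            // (an out-of-range offset throws a ValueError in PHP 8).
            if ($secondspc_fwd + 1 > $len) { $secondspc_fwd = false; break; }
            $secondspc_fwd = strpos($str, ' ', $secondspc_fwd + 1);
            if ($secondspc_fwd === false) break;
        }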
        if ($secondspc_fwd === false) $secondspc_fwd = $len + 1;

        // Add the keyword plus two words before and after to both the array
        // and the output.
        $kw_plus = substr($str, $secondspc_back, $secondspc_fwd - $secondspc_back);
        $kwds_plus_surround[] = $kw_plus;
        $newstr .= $kw_plus;

        // Update our current position in the string.
        $currpos = $secondspc_fwd;
    }

}

echo 'ORG: '.$str."\n\n".'NEW: '.$newstr."\n\n";
var_export($kwds_plus_surround);

// Finds the earliest match, if any, of any of the $needles (an array)
// in $str (a string) starting from $currpos (an integer).
// Returns an array whose first member is the index of the earliest match,
// or -1 if no match was found, and whose second member is the index into
// $needles of the entry that matched in the $str.
function strpos_arr($str, $needles, $currpos) {
    $ret = array(-1, -1);
    foreach ($needles as $idx => $needle) {
        $offset = stripos($str, $needle, $currpos);
        if ($offset !== false &&
            ($offset < $ret[0] || $ret[0] == -1)) {
             $ret = array($offset, $idx);
        }
    }
    return $ret;
}

// Replaces in $str all occurrences of 'e' with 'U'.
function do_replace($str) {
    return str_replace('e', 'U', $str);
}
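
The backward lookup above relies on how strrpos() treats a negative offset: only matches that start at or before that distance from the end of the string are considered. A minimal sketch of that behavior in isolation, assuming a made-up sample string and a hypothetical helper named last_space_at_or_before() that is not part of the answer:

```php
<?php
// Hypothetical helper (not part of the answer): index of the last space
// located at or before index $from, or false if there is none.
function last_space_at_or_before(string $s, int $from)
{
    $from = min($from, strlen($s) - 1);   // clamp to the last valid index
    if ($from < 0) {
        return false;                     // empty string or negative index
    }
    // The offset is now in the range [-strlen($s), -1], so strrpos() only
    // considers matches that start at index $from or earlier.
    return strrpos($s, ' ', $from - strlen($s));
}

$s = 'one two kw1 end';                   // spaces at indexes 3, 7 and 11
$kwPos = strpos($s, 'kw1');               // 8

// Skip the space directly before the keyword (index 7) by starting at 6.
$first = last_space_at_or_before($s, $kwPos - 2);    // int(3)
// No space exists at or before index 2, so the lookback stops here.
$second = last_space_at_or_before($s, $first - 1);   // bool(false)
```

Clamping the index before calling strrpos() keeps the offset inside the string, which avoids the out-of-range error when a keyword sits at the very start of the input.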

Answer 1: (score: 0)

Build a regular expression for the delimiters you want to find, augmented with the before/after business:

$regexp = '('
     . implode(
         '|',
         array_map(
             function ($s) {return "before Before $s After after";},
             $arr_keywords)
         )
    . ')';
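
The keywords are interpolated into the pattern verbatim, which is fine for "MY KEYWORD 1" but would misbehave if a keyword contained regex metacharacters. A hedged variant, assuming you want to escape each keyword with preg_quote() and that "/" is the delimiter used later; the second keyword here is made up for illustration:

```php
<?php
// Assumed keyword list; the second entry contains regex metacharacters.
$arr_keywords = array('MY KEYWORD 1', 'C++ (beta)');

$alternatives = array_map(
    function ($s) {
        // Escape the keyword, including the "/" delimiter used later.
        return 'before Before ' . preg_quote($s, '/') . ' After after';
    },
    $arr_keywords
);

// One capturing group around all alternatives, as in the answer.
$regexp = '(' . implode('|', $alternatives) . ')';

echo $regexp, "\n";
```

The resulting $regexp has the same shape as the one built above, so it can be dropped into the same "/$regexp/" pattern.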

Split the string on those delimiters:

$chunks = preg_split("/$regexp/", $str, -1, PREG_SPLIT_DELIM_CAPTURE);
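
To see why the capturing group matters, here is a small sketch of what preg_split() returns with PREG_SPLIT_DELIM_CAPTURE. The short input string and the keyword K1 are assumptions chosen to keep the output readable:

```php
<?php
// Hypothetical short input with a single delimiter in the middle.
$str = 'aee before Before K1 After after bee';
$regexp = '(before Before K1 After after)';

// Because the pattern is wrapped in a capturing group, the matched
// delimiter is returned as its own element between the pieces.
$chunks = preg_split("/$regexp/", $str, -1, PREG_SPLIT_DELIM_CAPTURE);

var_export($chunks);
```

With a single capturing group and no PREG_SPLIT_NO_EMPTY flag, the plain text lands at even indexes and the captured delimiters at odd indexes.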

Build a new string by iterating over the chunks:

$new = '';
foreach ($chunks as $c) {
    $new .= preg_match("/$regexp/", $c)
        ? $c
        : str_replace('e', 'U', $c);
}
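
The approach above assumes the surrounding words are literally "before Before" and "After after", and it does not collect the array asked for in the question. A possible generalization, offered only as a sketch: match any two whitespace-separated words on each side of a keyword, and record each match while rebuilding the string. The function name replace_outside_keywords() and the sample input are hypothetical.

```php
<?php
// Hypothetical helper, not taken from either answer.
// Returns array($newString, $keywordsWithContext).
function replace_outside_keywords(string $str, array $keywords): array
{
    $quoted = array_map(
        function ($k) { return preg_quote($k, '/'); },
        $keywords
    );
    // Two words, the keyword, then two more words, in one capturing group.
    $pattern = '/((?:\S+\s+){2}(?:' . implode('|', $quoted) . ')(?:\s+\S+){2})/';

    $chunks = preg_split($pattern, $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $found = array();
    $new = '';
    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 1) {
            // Odd indexes hold the captured keyword plus context: keep as-is.
            $found[] = $chunk;
            $new .= $chunk;
        } else {
            // Even indexes hold the text in between: apply e => U.
            $new .= str_replace('e', 'U', $chunk);
        }
    }
    return array($new, $found);
}

list($new, $found) = replace_outside_keywords(
    'see one two KEY three four eee',
    array('KEY')
);
echo $new, "\n";
var_export($found);
```

Words are split on whitespace only, so punctuation attached to a word (such as the comma in "after,") stays part of the captured context. Keywords that sit fewer than two words apart, or fewer than two words from either end of the string, are not matched by this pattern and would need separate handling.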