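Get quotes inside markers with a regular expression

Asked: 2016-07-25 13:11:00

Tags: php regex preg-replace

Hi there. I am trying to get all quotes inside a specific start-end string. Suppose I have this string: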

`Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]`

Now I want every " inside [start] .. [end] to be replaced with &quot;:

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
$regex = '/(?<=\[start])(.*?)(?=\[end])/';
$replace = '&quot;';

$string = preg_replace($regex,$replace,$string);

This matches the whole text between [start] and [end], but I only want to match the " characters inside it:

//expected: Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]
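
For reference, here is a minimal sketch of what the attempt above actually produces. The helper name `replace_between_markers` is made up for illustration; the regex is the one from the question, unchanged. Because the lookarounds pin the match to everything between the markers, the whole span is replaced by a single &quot; rather than each quote individually.

```php
<?php
// Hypothetical helper name; wraps the regex from the question as-is.
function replace_between_markers(string $input): string
{
    // The lookbehind/lookahead anchor the match to the text between the
    // markers, so the entire span is what gets replaced.
    $regex = '/(?<=\[start])(.*?)(?=\[end])/';
    return preg_replace($regex, '&quot;', $input);
}

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
echo replace_between_markers($string), PHP_EOL;
// prints: Hello "world". [start]&quot;[end]. It should work with [start]&quot;[end]
```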

Any ideas?

2 answers:

Answer 0: (score: 3)

(?s)"(?=((?!\[start\]).)*\[end\])

Live demo

Explanation:

 (?s)                       DOT_ALL modifier
 "                          Literal "
 (?=                        Begin lookahead
      (                         # (1 start)
           (?! \[start\] )          Assert that [start] does not begin at the current position
           .                        If the assertion holds, consume one character
      )*                        # (1 end)
      \[end\]                   Until reaching [end]
 )                          End lookahead

PHP live demo
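
As a rough sketch of how the pattern above might be applied with preg_replace: the function name `escape_quotes_lookahead` is an assumption made for this example, not part of the answer. It relies on the markers being balanced, since a quote counts as "inside" only when an [end] follows it with no [start] in between.

```php
<?php
// Hypothetical helper name; the pattern is the one explained above.
function escape_quotes_lookahead(string $input): string
{
    // Match a double quote only if [end] can be reached from it
    // without passing over another [start].
    $regex = '/(?s)"(?=((?!\[start\]).)*\[end\])/';
    return preg_replace($regex, '&quot;', $input);
}

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
echo escape_quotes_lookahead($string), PHP_EOL;
// prints: Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]
```

Quotes before the first [start] and after the last [end] are left alone, because the lookahead cannot reach an [end] from them without crossing a [start] or running off the end of the string.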

Answer 1: (score: 0)

An approach with preg_replace_callback allows a simpler regex (assuming your string always has paired, non-nested [start]...[end] blocks):

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
$regex = '/\[start].*?\[end]/s';
$string = preg_replace_callback($regex, function($m) {
    return str_replace('"', '&quot;', $m[0]);
},$string);
echo $string;
// => Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]

See the PHP IDEONE demo

The '/\[start].*?\[end]/s' regex matches [start], then any 0+ characters as few as possible (including newlines, since the /s DOTALL modifier is used), and then [end].
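
To illustrate why the /s modifier matters here, the sketch below runs the same callback with and without it on a block that spans two lines. The helper name `escape_quotes_callback` and the sample input are assumptions made for this example.

```php
<?php
// Hypothetical helper: replaces quotes inside every match of $regex.
function escape_quotes_callback(string $input, string $regex): string
{
    return preg_replace_callback($regex, function ($m) {
        // $m[0] is the whole matched [start]...[end] block.
        return str_replace('"', '&quot;', $m[0]);
    }, $input);
}

// A marked block that spans two lines (illustrative input).
$multiline = "[start]line \"one\"\nline \"two\"[end]";

// Without /s the dot stops at the newline, so nothing matches.
$withoutS = escape_quotes_callback($multiline, '/\[start].*?\[end]/');
// With /s the dot also matches the newline and the block is found.
$withS = escape_quotes_callback($multiline, '/\[start].*?\[end]/s');

var_dump($withoutS === $multiline); // bool(true)
echo $withS, PHP_EOL;
```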

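If you need to ensure the shortest window between a [start] and the next [end], you need to use a regex with a tempered greedy token, as in Revo's answer: '/\[start](?:(?!\[(?:start|end)]).)*\[end]/s' (see the PHP demo and the regex demo).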
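
A small sketch of the difference between the lazy pattern and the tempered greedy token, using an input with a stray, unclosed [start]. The helper name `escape_quotes_in_blocks` and the sample input are assumptions made for this example.

```php
<?php
// Hypothetical helper: replaces quotes inside every match of $regex.
function escape_quotes_in_blocks(string $input, string $regex): string
{
    return preg_replace_callback($regex, function ($m) {
        return str_replace('"', '&quot;', $m[0]);
    }, $input);
}

// The first [start] is never closed; only the second block is well formed.
$input = '[start]"a" [start]"b"[end]';

// Lazy dot: starts at the first [start] and runs to the first [end],
// so the quotes around "a" are replaced as well.
echo escape_quotes_in_blocks($input, '/\[start].*?\[end]/s'), PHP_EOL;
// prints: [start]&quot;a&quot; [start]&quot;b&quot;[end]

// Tempered greedy token: the match may not cross another [start] or [end],
// so only the innermost [start]"b"[end] block is matched.
echo escape_quotes_in_blocks($input, '/\[start](?:(?!\[(?:start|end)]).)*\[end]/s'), PHP_EOL;
// prints: [start]"a" [start]&quot;b&quot;[end]
```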