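Regex to grab all text between brackets, but not inside quotes

Asked: 2013-11-17 22:31:10

Tags: php regex

I'm trying to match all text between {brackets}, but not when it sits inside quotation marks. For example: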

$str = 'value that I {want}, vs value "I do {NOT} want" ';

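My result should grab "want" but omit "NOT". I've searched Stack Overflow desperately for a regex that can do this, with no luck. I've seen answers that let me get the text between quotes, but not the text that is outside quotes and inside brackets. Is this even possible?

If so, how is it done?

So far, this is what I have: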

preg_match_all('/{([^}]*)}/', $str, $matches);

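But sadly, it just grabs all the text inside brackets, including {NOT}.

2 answers:

Answer 0 (score: 6)

It's quite tricky to get this done in one go. I also wanted it to work with nested brackets, so this uses a recursive pattern as well: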

("|').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}

OK, let's explain this mysterious regex:

("|')                   # match either a single quote or a double quote and put it in group 1
.*?                     # match anything ungreedy until ...
\1                      # match what was matched in group 1
(*SKIP)(*FAIL)          # make it skip this match since it's a quoted set of characters
|                       # or
\{(?:[^{}]|(?R))*\}     # match a pair of brackets (even if they are nested)
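To see the (*SKIP)(*FAIL) idea in isolation, here is a stripped-down sketch without the recursion. It assumes only double quotes matter and that brackets are never nested; the capture group for the inner text is an addition for illustration, not part of the pattern above.

```php
<?php
$str = 'value that I {want}, vs value "I do {NOT} want" ';

// First alternative: consume a whole double-quoted run, then discard it
// with (*SKIP)(*FAIL) so nothing inside it can be matched later.
// Second alternative: a non-nested {...} pair; group 1 holds the inner text.
preg_match_all('~"[^"]*"(*SKIP)(*FAIL)|\{([^{}]*)\}~', $str, $m);

print_r($m[1]); // the inner text of each bracket pair found outside quotes
```

For the question's sample string, `$m[1]` contains only `want`.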


Some PHP code:

$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Let's make it {nested {this {time}}}
And yes, it's even "{bullet-{proof}}" :)
INP;

preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);

print_r($m[0]);

Sample output:

Array
(
    [0] => {want}
    [1] => {nested {this {time}}}
)
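The matches above still include the outer brackets, while the question asks for the text inside them. One possible way to get there, sketched under the assumption that every match starts with `{` and ends with `}` (which the recursive pattern guarantees), is to trim one character from each end. The variable name `$inner` and the shortened input are illustrative only; the arrow function needs PHP 7.4 or later.

```php
<?php
$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Now make it {nested {this {time}}}
INP;

// Same recursive pattern as in the answer.
preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);

// Drop the first and last character (the outer brackets) of every match.
$inner = array_map(fn($s) => substr($s, 1, -1), $m[0]);

print_r($inner);
```

Here `$inner` holds `want` and `nested {this {time}}`; nested brackets inside a match are left untouched.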

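Answer 1 (score: 3)

Personally, I'd handle this in two passes: the first to strip out everything between double quotes, the second to grab the text you want.

Perhaps something like this: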

$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Get rid of everything in between double quotes
$str = preg_replace("/\".*\"/U","",$str);

// Now I can safely grab any text between curly brackets
preg_match_all("/\{(.*)\}/U",$str,$matches);

Working example here: http://3v4l.org/SRnva
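The two passes can also be wrapped in a small helper. This is only a sketch: the function name `bracesOutsideQuotes` is hypothetical, it uses negated character classes instead of the `/U` modifier, and it assumes double quotes are balanced and brackets are not nested.

```php
<?php
// Hypothetical helper illustrating the two-pass approach.
function bracesOutsideQuotes(string $str): array
{
    // Pass 1: remove every double-quoted run, quotes included.
    $stripped = preg_replace('/"[^"]*"/', '', $str);

    // Pass 2: collect the text inside each remaining {...} pair.
    preg_match_all('/\{([^{}]*)\}/', $stripped, $matches);

    return $matches[1];
}

print_r(bracesOutsideQuotes('value that I {want}, vs value "I do {NOT} want" '));
```

For the question's string this returns a one-element array containing `want`. Unlike the recursive pattern in the other answer, it would not return a nested group such as `{a {b}}` as a single match.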