Splitting a conditional expression with preg_match_all

Date: 2017-10-11 17:23:51

Tags: php preg-match-all

I have data in this format:

Randomtext1(random2, random4) Randomtext2 (ran dom) Randomtext3 Randomtext4 (random5,random7,random8) Randomtext5 (Randomtext4 (random5,random7,random8), random10) Randomtext11()

With this pattern:

preg_match_all("/\b\w+\b(?:\s*\(.*?\)|)/",$text,$matches);

I get:

0 => 'Randomtext1(random2, random4)',
1 => 'Randomtext2 (ran dom)',
2 => 'Randomtext3',
3 => 'Randomtext4 (random5,random7,random8)',
4 => 'Randomtext5 (Randomtext4 (random5,random7,random8)',
5 => 'random10',
6 => 'Randomtext11()',

But I want:

0 => 'Randomtext1(random2, random4)',
1 => 'Randomtext2 (ran dom)',
2 => 'Randomtext3',
3 => 'Randomtext4 (random5,random7,random8)',
4 => 'Randomtext5 (Randomtext4 (random5,random7,random8), random10)',
5 => 'Randomtext11()',

Any ideas?

1 Answer:

Answer 0 (score: 0)

You need a recursive pattern to handle the nested parentheses:

if ( preg_match_all('~\w+(?:\s*(\([^()]*+(?:(?1)[^()]*)*+\)))?~', $text, $matches) )
    print_r($matches[0]);


Details:

~    # delimiter
\w+
(?:
    \s*
    ( # capture group 1
        \(
        [^()]*+  # anything that isn't a parenthesis
                 # (possessive quantifier *+ to avoid excessive backtracking
                 # on a badly formatted string)
        (?:
            (?1) # recursion in the capture group 1
            [^()]*
        )*+
        \)
    )  # close the capture group 1
)? # to make the group optional (instead of "|)")
~
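
The commented layout above can be run almost as written by adding the x (extended) modifier, which makes PCRE ignore whitespace and # comments inside the pattern. Below is a minimal sketch of that idea; the function name splitTerms is hypothetical and only there to make the example self-contained, and the input is the sample string from the question.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps the answer's
// recursive pattern, written in free-spacing form thanks to the x modifier.
function splitTerms(string $text): array
{
    $pattern = <<<'REGEX'
~
\w+                 # a word
(?:
    \s*
    (               # capture group 1: one balanced pair of parentheses
        \(
        [^()]*+     # anything that is not a parenthesis
        (?:
            (?1)    # recurse into group 1 for a nested pair
            [^()]*
        )*+
        \)
    )
)?                  # the parenthesized part is optional
~x
REGEX;

    preg_match_all($pattern, $text, $matches);
    return $matches[0] ?? [];
}

// Sample input taken from the question.
$text = 'Randomtext1(random2, random4) Randomtext2 (ran dom) Randomtext3 '
      . 'Randomtext4 (random5,random7,random8) '
      . 'Randomtext5 (Randomtext4 (random5,random7,random8), random10) Randomtext11()';

print_r(splitTerms($text));
```

One caveat with this form: the comments must not contain the delimiter character (~ here), because PHP looks for the closing delimiter before PCRE ever sees the comments.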

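Note that you don't need to add word boundaries around \w+: the quantifier is greedy, so it always consumes a whole run of word characters.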
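
As a quick sanity check of that remark, the sketch below runs the recursive pattern with and without \b on the sample input from the question and compares the two result lists. The helper name matchTerms and its $useBoundaries flag are hypothetical, introduced only for this comparison.

```php
<?php
// Hypothetical helper: runs the answer's pattern, optionally wrapping \w+ in \b.
function matchTerms(string $text, bool $useBoundaries): array
{
    $word = $useBoundaries ? '\b\w+\b' : '\w+';
    $pattern = '~' . $word . '(?:\s*(\([^()]*+(?:(?1)[^()]*)*+\)))?~';
    preg_match_all($pattern, $text, $matches);
    return $matches[0] ?? [];
}

// Sample input taken from the question.
$sample = 'Randomtext1(random2, random4) Randomtext2 (ran dom) Randomtext3 '
        . 'Randomtext4 (random5,random7,random8) '
        . 'Randomtext5 (Randomtext4 (random5,random7,random8), random10) Randomtext11()';

// Compare the two variants on the same input.
var_dump(matchTerms($sample, true) === matchTerms($sample, false));
```

This only shows that the two variants agree on this particular input; it is an illustration, not a proof for every possible string.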