PHP regex with preg_match_all() for content between [[ and ]]

Date: 2013-12-12 01:02:31

Tags: php regex preg-match-all

I want to use preg_match_all() to extract the content between [[ and ]], but ignore [[[ and ]]]. For example, given this text:

$text = <<<TEXT
Some text going here

[[ 1. this is a text ]]

another text but multiple lines

[[ 2. this 
is a 
text ]]

This should be ignored, having 3 on the left

[[[ 3. this is a text ]]

This should be ignored, having 3 on the right

[[ 4. this is a text ]]]

This should be ignored, having 3 both on the left and right

[[[ 5. this is a text ]]]

This is the final sentence.

[[ 6. this is a text ]]
TEXT;

if (preg_match_all("(?!<\[)(\[\[.*?\]\])(?!\[)", $text, $tags, PREG_PATTERN_ORDER)) {
        $tags = $tags[0];
}

echo '<pre>';
print_r($tags);
echo '</pre>';

So only 1, 2, and 6 should be selected. However, the regex I tried above selects everything except 2, so it does not work as expected.
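The behaviour described above can be reproduced with a small sketch. It assumes `~` delimiters around the attempted pattern and uses a compact stand-in string rather than the original text. Three details appear to explain the result: `(?!<\[)` is parsed as a negative lookahead for the literal text `<[` (a negative lookbehind is written `(?<!\[)`), the trailing `(?!\[)` checks for `[` rather than `]`, and `.` does not match a newline unless the `s` modifier is set, which is why the multi-line tag is skipped.

```php
<?php
// Compact stand-in for the question's sample text (an assumption, not the original string).
$text = "[[ 1. one ]]\n[[ 2. two\nlines ]]\n[[[ 3. left ]]\n[[ 4. right ]]]\n[[ 6. six ]]";

// The attempted pattern, wrapped in ~ delimiters (the delimiters are assumed).
$attempt = '~(?!<\[)(\[\[.*?\]\])(?!\[)~';

preg_match_all($attempt, $text, $m);
$attempted = $m[0];

// Tags 3 and 4 are matched, while the multi-line tag 2 is missed.
print_r($attempted);
```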

3 Answers:

Answer 0 (score: 4)

You can use this pattern:

preg_match_all('~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~', $text, $tags);

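Notes:

  • There is no need to pass PREG_PATTERN_ORDER, since it is the default for preg_match_all().
  • I added a capture group for the content between the square brackets; you can remove it if you don't need it.
  • If square brackets are not allowed inside the tags, the pattern can be shortened to: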

~(?<!\[)\[\[([^][]*)]](?!])~
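As a quick check, both patterns from this answer can be run against a compact sample. This is a minimal sketch: the sample string and the variable names are assumptions made for illustration, modelled on the question's text rather than copied from it.

```php
<?php
// Compact sample modelled on the question's text (an assumption, not the original string).
$text = "a [[ 1. one ]] b [[ 2. two\nlines ]] c [[[ 3. left ]] "
      . "d [[ 4. right ]]] e [[[ 5. both ]]] f [[ 6. six ]]";

// Full pattern from the answer: exactly two brackets on each side.
$full = '~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~';
// Shortened pattern: valid when no square brackets occur inside a tag.
$short = '~(?<!\[)\[\[([^][]*)]](?!])~';

preg_match_all($full, $text, $tags);
preg_match_all($short, $text, $tagsShort);

// Index 0 holds the full matches, index 1 the captured content.
$contents      = array_map('trim', $tags[1]);
$contentsShort = array_map('trim', $tagsShort[1]);

print_r($contents);
```

Because `[^]]` is a negated character class, it also matches newlines, so the multi-line tag is captured without the `s` modifier.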

Answer 1 (score: 1)

Here is a regex that should do the job:

((?<!\[)\[\[([^\[][^\]]*)\]\](?!\]))

REGEX 101

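Breaking it down:

  • Not preceded by [
  • [[
  • One character that is not [
  • Any character except ], 0 or more times
  • ]]
  • Not followed by ]

This should be bulletproof, except that it requires at least 1 character between [[ and ]].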
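The one-character requirement can be seen in a short sketch. The `~` delimiters, the sample string, and the variable names are assumptions added for illustration; the second pattern is the one from Answer 0, included only for comparison.

```php
<?php
// Pattern from this answer, wrapped in ~ delimiters (the delimiters are assumed).
$pattern = '~((?<!\[)\[\[([^\[][^\]]*)\]\](?!\]))~';
// Pattern from Answer 0, which allows empty content.
$allowEmpty = '~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~';

// Hypothetical sample: one normal tag, one empty tag, two malformed tags, one more tag.
$text = "[[ok]] [[]] [[[no]] [[no]]] [[x]]";

preg_match_all($pattern, $text, $m);
preg_match_all($allowEmpty, $text, $e);

$contents      = $m[2]; // group 2 is the text between the brackets
$contentsEmpty = $e[1];

print_r($contents);      // the empty tag [[]] is not matched
print_r($contentsEmpty); // the empty tag appears as an empty string
```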

Answer 2 (score: 0)

Try:

preg_match_all('/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s', ...)

http://regex101.com/r/jC2mM0

http://codepad.viper-7.com/bbs3oR

Array
(
    [0] => Array
        (
            [0] => 
[[ 1. this is a text ]]
            [1] => 
[[ 2. this 
is a 
text ]]
            [2] => 
[[ 6. this is a text ]]
        )

    [1] => Array
        (
            [0] => 1. this is a text
            [1] => 2. this 
is a 
text
            [2] => 6. this is a text
        )

    [2] => Array
        (
            [0] => 
            [1] => 
            [2] => 
        )

)
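A minimal sketch of reading the named group from this pattern follows. The arguments elided in the call above are assumed here to be a subject string and a matches array, and the sample strings are hypothetical. Note that this pattern consumes the character before [[ and the character after ]], so two tags separated by a single character are not both found; the lookaround-based patterns in the other answers do not have this limitation.

```php
<?php
$pattern = '/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s';

// Hypothetical sample text, not the string from the question.
$text = "intro\n[[ 1. one ]]\n[[[ 3. left ]]\n[[ 4. right ]]]\n[[ 6. six ]]";
preg_match_all($pattern, $text, $matches);

// The named group is available under its name (and also under a numeric index).
$contents = $matches['content'];
print_r($contents);

// Two tags separated by one space: the space is consumed by the first match,
// so the second tag has no preceding character left to match against.
preg_match_all($pattern, "[[ a ]] [[ b ]]", $adjacent);
$adjacentContents = $adjacent['content'];
print_r($adjacentContents);
```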