I want to use preg_match_all() to extract the content between [[ and ]], but ignore anything wrapped in [[[ or ]]]. For example, given this text:
$text = <<<TEXT
Some text going here
[[ 1. this is a text ]]
another text but multiple lines
[[ 2. this
is a
text ]]
This should be ignored, having 3 on the left
[[[ 3. this is a text ]]
This should be ignored, having 3 on the right
[[ 4. this is a text ]]]
This should be ignored, having 3 both on the left and right
[[[ 5. this is a text ]]]
This is the final sentence.
[[ 6. this is a text ]]
TEXT;
if (preg_match_all("(?!<\[)(\[\[.*?\]\])(?!\[)", $text, $tags, PREG_PATTERN_ORDER)) {
    $tags = $tags[0];
}
echo '<pre>';
print_r($tags);
echo '</pre>';
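So only 1., 2. and 6. should be selected. However, the regular expression I tried above selects everything except 2., so it does not work as expected.
Answer 0 (score: 4)
You can use this pattern: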
preg_match_all('~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~', $text, $tags);
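Notes:
There is no need to specify PREG_PATTERN_ORDER, since it is the default for preg_match_all().
I added a capturing group for the content inside the square brackets; you can remove it if you don't need it.
If square brackets are not allowed inside a tag, the pattern can be shortened to: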
~(?<!\[)\[\[([^][]*)]](?!])~
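As a rough check, the sketch below runs both patterns above against a shortened version of the question's sample text. The `$sample` string and the variable names are made up for illustration, and the line numbered 7 is an added assumption, included only to show where the two patterns differ (an opening bracket inside a tag).

```php
<?php
// Shortened, hypothetical sample: valid tags, triple-bracket tags,
// and one tag that contains a single "[" in its content.
$sample = <<<'TEXT'
[[ 1. one line ]]
[[ 2. two
lines ]]
[[[ 3. extra left ]]
[[ 4. extra right ]]]
[[[ 5. extra both ]]]
[[ 7. with [ inside ]]
TEXT;

// First pattern: the content may contain "[" but not "]".
$general = '~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~';
// Shortened pattern: the content may contain neither "[" nor "]".
$strict = '~(?<!\[)\[\[([^][]*)]](?!])~';

preg_match_all($general, $sample, $generalTags);
preg_match_all($strict, $sample, $strictTags);

// Index 1 holds the captured content; trim the padding spaces.
$generalContent = array_map('trim', $generalTags[1]);
$strictContent = array_map('trim', $strictTags[1]);

print_r($generalContent); // tags 1, 2 and 7
print_r($strictContent);  // tags 1 and 2 only
```

Neither pattern needs the `s` modifier for the multi-line tag, because a negated character class such as `[^]]` matches newlines as well.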
Answer 1 (score: 1)
Here is a regular expression that should do the job:
((?<!\[)\[\[([^\[][^\]]*)\]\](?!\]))
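Breaking it down:
(?<!\[) asserts that the opening brackets are not preceded by another [
\[\[ matches the opening [[
([^\[][^\]]*) captures the content: one character that is not [, followed by any number of characters that are not ]
\]\] matches the closing ]]
(?!\]) asserts that the closing brackets are not followed by another ]
This should be bulletproof, except that it requires at least one character between [[ and ]].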
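The at-least-one-character requirement can be checked with a small sketch. The input string and variable names are illustrative assumptions, the `~` delimiters are added here because the pattern is given without any, and the first pattern from answer 0 is included only for comparison.

```php
<?php
// Hypothetical input: one normal tag and one empty tag.
$input = 'before [[a]] middle [[]] after';

// Pattern from this answer, wrapped in "~" delimiters (an assumption).
// Group 1 is the whole tag, group 2 is the content.
$pattern = '~((?<!\[)\[\[([^\[][^\]]*)\]\](?!\]))~';

// First pattern from answer 0, for comparison. Group 1 is the content.
$other = '~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~';

preg_match_all($pattern, $input, $m);
preg_match_all($other, $input, $n);

print_r($m[2]); // only "a": the empty tag is skipped
print_r($n[1]); // "a" and an empty string
```

Whether skipping empty tags is a problem depends on the input; if `[[]]` should never count as a tag, this behaviour may even be desirable.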
Answer 2 (score: 0)
Try:
preg_match_all('/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s', ...)
http://codepad.viper-7.com/bbs3oR
Array
(
[0] => Array
(
[0] =>
[[ 1. this is a text ]]
[1] =>
[[ 2. this
is a
text ]]
[2] =>
[[ 6. this is a text ]]
)
[1] => Array
(
[0] => 1. this is a text
[1] => 2. this
is a
text
[2] => 6. this is a text
)
[2] => Array
(
[0] =>
[1] =>
[2] =>
)
)
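For reference, a minimal sketch of how the named group in this pattern could be read. The `$sample` text is a shortened, made-up version of the question's input, and the `$m` variable name is an assumption, since the original call elides its arguments.

```php
<?php
// Shortened, hypothetical sample text.
$sample = <<<'TEXT'
Some text
[[ 1. one line ]]
more text
[[[ 3. extra left ]]
[[ 4. extra right ]]]
[[ 6. last ]]
TEXT;

$pattern = '/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s';
preg_match_all($pattern, $sample, $m);

// $m[0] holds the full matches, which also include the neighbouring
// characters consumed by the first and last groups (newlines here).
// $m['content'] holds the tag text without the padding characters.
print_r($m['content']);
```

Because this pattern consumes one character on each side of the tag and one padding character inside each pair of brackets, it relies on tags being padded and separated by other text, as they are in the sample.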