[quote]something [quote]something else[/quote] some text here[/quote]
$ matches [6] [0]:[quote]something [quote]something else[/quote]
$ matches [6] [0]:[quote]something [quote]something else[/quote] some text here[/quote]
$ matches [6] [1]:[quote]something else[/quote]
Answer 0 (score: 0)
To match nested structures, you need a recursive pattern, for example:
$data = '[quote]something [quote]something else[/quote] some text here[/quote]';
$pattern = '~\[quote](?>[^][]+|(?R))*\[/quote]~';
if (preg_match_all($pattern, $data, $m))
    print_r($m);
Pattern details:
~ # pattern delimiter: do not choose the slash here
\[quote] # literal opening tag
(?> # open an atomic group: possible content between tags
[^][]+ # all that is not a square bracket
| # OR
(?R) # recurse the whole pattern
)* # close the atomic group, repeat zero or more times
\[/quote] # literal closing tag
~
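Because matches cannot overlap, preg_match_all with this pattern reports only the outermost block. If the inner blocks are wanted as well, one common workaround (not part of the answer itself, shown here only as a sketch) is to wrap the pattern in a capture group inside a lookahead and recurse into that group with (?1) instead of (?R). The variable names $nestedPattern and $all are illustrative.

```php
<?php
$data = '[quote]something [quote]something else[/quote] some text here[/quote]';

// The lookahead consumes nothing, so the engine retries at every offset
// and group 1 captures each balanced block, nested ones included.
$nestedPattern = '~(?=(\[quote](?>[^][]+|(?1))*\[/quote]))~';

preg_match_all($nestedPattern, $data, $all);
print_r($all[1]); // balanced blocks in order of their start offset
```

The blocks come back in document order, so an outer block is always listed before the blocks it contains.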
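Note that this is the easy case. If your text may contain other, stray tags between the "quote" tags, you only need to change the atomic group to allow them (written in extended mode):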
(?> [^][]+ | \[ (?!/?quote\b) [^]]* ] | (?R) )*
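As a hedged usage sketch, here is the extended-mode group embedded in a full pattern. The sample string with a [b] tag is an assumption made for illustration. The lookahead sits directly after the opening bracket so that neither [quote] nor [/quote] can be consumed as a stray tag.

```php
<?php
// Assumed sample input: a non-quote tag ([b]) inside nested quotes.
$data = '[quote]a [b]bold[/b] [quote]inner[/quote] tail[/quote]';

$pattern = '~
    \[quote]
    (?>
        [^][]+                     # text without square brackets
      | \[ (?!/?quote\b) [^]]* ]   # any tag other than an opening or closing quote tag
      | (?R)                       # a nested quote block
    )*
    \[/quote]
~x';

if (preg_match_all($pattern, $data, $m)) {
    print_r($m[0]); // one match: the whole outer block
}
```

The x modifier makes PCRE ignore whitespace and # comments outside character classes, which is what allows the pattern to be laid out over several lines.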
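Answer 1 (score: 0)
If you are feeling really ambitious, you can sit in a while loop of searches and build up a nested array of content. Each newly matched core has to trigger a reentrant call to the parsing function that runs this regex.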
# //////////////////////////////////////////////////////
# // The General Guide to 3-Part Recursive Parsing
# // ----------------------------------------------
# // Part 1. CONTENT
# // Part 2. CORE
# // Part 3. ERRORS
(?is)
(?:
( # (1), Take off CONTENT
(?&content)
)
| # OR
\[quote\] # Start-Delimiter
( # (2), Take off The CORE
(?&core)
|
)
\[/quote\] # End-Delimiter
| # OR
( # (3), Take off Unbalanced (delimiter) ERRORS
\[/?quote\]
)
)
# ///////////////////////
# // Subroutines
# // ---------------
(?(DEFINE)
# core
(?<core>
(?>
(?&content)
|
\[quote\]
# recurse core
(?:
(?= . )
(?&core)
|
)
\[/quote\]
)+
)
# content
(?<content>
(?>
(?!
\[/?quote\]
)
.
)+
)
)
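The reentrant parsing function described above could look roughly like the sketch below. The names QUOTE_RX and parseQuotes and the shape of the node arrays are assumptions made for this example; the pattern is a condensed copy of the one above, with (?is) moved into the modifiers.

```php
<?php
// Condensed three-part pattern: (1) content, (2) core, (3) errors.
const QUOTE_RX = '~
    ( (?&content) )                          # (1) plain content
  | \[quote\] ( (?&core) | ) \[/quote\]      # (2) core of a balanced quote
  | ( \[/?quote\] )                          # (3) unbalanced delimiter
    (?(DEFINE)
      (?<core>    (?> (?&content) | \[quote\] (?: (?= . ) (?&core) | ) \[/quote\] )+ )
      (?<content> (?> (?! \[/?quote\] ) . )+ )
    )
~xis';

// Hypothetical helper: returns a nested array of text, quote and error nodes.
function parseQuotes(string $text): array
{
    $nodes = [];
    preg_match_all(QUOTE_RX, $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if (($m[1] ?? '') !== '') {
            $nodes[] = ['type' => 'text', 'value' => $m[1]];
        } elseif (($m[3] ?? '') !== '') {
            $nodes[] = ['type' => 'error', 'value' => $m[3]];
        } else {
            // Balanced quote: re-enter the parser on its core (possibly empty).
            $nodes[] = ['type' => 'quote', 'children' => parseQuotes($m[2] ?? '')];
        }
    }
    return $nodes;
}

$tree = parseQuotes('[quote]something [quote]something else[/quote] some text here[/quote]');
print_r($tree);
```

Each recursion level handles exactly one nesting depth, so the depth of the resulting array mirrors the depth of the quotes. Unbalanced tags surface as error nodes instead of aborting the parse.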
Answer 2 (score: 0)
You have to build a tree structure. Have a look at STML Parser on CodeProject.
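For illustration only, a tree can also be built without recursive regex features by splitting the text into tokens and keeping a stack of open quotes. This is a generic sketch and not the approach of STML Parser. The function name buildQuoteTree, the node shape, and the choice to close unterminated quotes at the end of the input are all assumptions.

```php
<?php
// Hypothetical helper: tokenizes on quote tags and nests them with a stack.
function buildQuoteTree(string $text): array
{
    $tokens = preg_split(
        '~(\[/?quote\])~',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $stack = [[]]; // stack of child lists; $stack[0] is the top level
    foreach ($tokens as $tok) {
        if ($tok === '[quote]') {
            $stack[] = []; // open a new nesting level
        } elseif ($tok === '[/quote]' && count($stack) > 1) {
            $children = array_pop($stack); // close the current level
            $stack[count($stack) - 1][] = ['type' => 'quote', 'children' => $children];
        } else {
            // Plain text, or a stray closing tag at the top level.
            $stack[count($stack) - 1][] = ['type' => 'text', 'value' => $tok];
        }
    }
    // Assumption: quotes left open are closed implicitly at end of input.
    while (count($stack) > 1) {
        $children = array_pop($stack);
        $stack[count($stack) - 1][] = ['type' => 'quote', 'children' => $children];
    }
    return $stack[0];
}

$quoteTree = buildQuoteTree('[quote]something [quote]something else[/quote] some text here[/quote]');
print_r($quoteTree);
```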