Question

老实说，我认为我应该首先请求你帮助解决这个问题的语法。

但是，如果你能理解我的意思，请用适当的编辑标题。

有没有办法制作可以拆分文字的模式。

{{START}}
    {{START}}
        {{START}}
            {{START}}
            {{END}}
        {{END}}
    {{END}}
{{END}}

所以{{START}}每个{{END}}都会从最后一个到最后一个与{{END}}匹配！

如果我不能只使用正则表达式那样做。用PHP做什么呢？

先谢谢你。

Answer 1

这超出了正则表达式的功能，正则表达式只能解析常规语法。您所描述的内容需要下推自动机（常规语言由regular automaton定义）。

您可以使用正则表达式来解析单个元素，但“深度”部分需要由具有内存概念的语言处理（PHP可以用于此）。

因此，在您的解决方案中，正则表达式将仅用于标识您的标记，而跟踪深度和确定END标记所属元素的真实逻辑必须是您的程序本身。

Answer 2

有可能！您可以使用递归正则表达式获得每个级别的内容：

$data = <<<LOD
{{START1}}
    aaaaa
    {{START2}}
        bbbbb
        {{START3}}
            ccccc
            {{START4}}
                ddddd
            {{END4}}
        {{END3}}
    {{END2}}
{{END1}}
LOD;

$pattern = '~(?=({{START\d+}}(?>[^{]++|(?1))*{{END\d+}}))~';
preg_match_all ($pattern, $data, $matches);

print_r($matches);

解释：

部分：({{START\d+}}(?>[^{]++|(?1))*{{END\d+}})

该模式的这一部分描述了具有{{START#}}和{{END#}}

的嵌套结构

(             # open the first capturing group
{{START\d+}}  
(?>           # open an atomic group (= backtracks forbidden)
    [^{]++    # all that is not a { one or more times (possessive)
  |           # OR
    (?1)      # refer to the first capturing group itself
)             # close the atomic group
{END\d+}}     # 
)             # close the first capturing group

现在的问题是你无法仅使用此部分捕获所有级别，因为字符串的所有字符都由模式使用。换句话说，您无法匹配字符串的重叠部分。

问题是将所有这部分包装在零宽度断言中，该断言不消耗像前瞻(?=...)这样的字符，结果：

(?=({{START\d+}}(?>[^{]++|(?1))*{{END\d+}}))

这将匹配所有级别。

Answer 3

你不能用纯RegEx做到这一点，但是只需一个简单的循环即可完成。

JS示例：

//[.\s\S]* ensures line breaks are matched (dotall not supported in JS)
var exp = /\{\{START\}\}([.\s\S]*)\{\{END\}\}/;

var myString = "{{START}}\ntest\n{{START}}\ntest 2\n{{START}}\ntest 3\n{{START}}\ntest4\n{{END}}\n{{END}}\n{{END}}\n{{END}}";

var matches = [];
var m = exp.exec(myString);
while ( m != null ) {
    matches.push(m[0]);
    m = exp.exec(m[1]);
}

alert(matches.join("\n\n"));

PHP（我不知道这是不是正确，因为我已经完成了PHP，所以这是永远的）

$pattern = "/\{\{START\}\}([.\s\S]*)\{\{END\}\}/";
$myString = "{{START}}\ntest\n{{START}}\ntest 2\n{{START}}\ntest 3\n{{START}}\ntest4\n{{END}}\n{{END}}\n{{END}}\n{{END}}";

$result = preg_match($pattern, $myString, $matches, PREG_OFFSET_CAPTURE);
$outMatches = array();
while ( $result ) {
    array_push($outMatches, $matches[0]);
    $result = preg_match($pattern, $matches[1], $matches, PREG_OFFSET_CAPTURE);
}
print($outMatches);

输出：

{{START}}
test
{{START}}
test 2
{{START}}
test 3
{{START}}
test4
{{END}}
{{END}}
{{END}}
{{END}}

{{START}}
test 2
{{START}}
test 3
{{START}}
test4
{{END}}
{{END}}
{{END}}

{{START}}
test 3
{{START}}
test4
{{END}}
{{END}}

{{START}}
test4
{{END}}

如何使用正则表达式创建循环？

3 个答案: