How do I match multiple (N) spaces in a regular expression?

Time: 2013-05-21 12:58:17

Tags: php regex regex-lookarounds

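I have a regular expression, (?<={% start %}).*?(?={% end %}), that matches everything between two custom tags.

The problem is that if the tags contain extra whitespace (for example "{%   start   %}") and I add a \s+? condition, the regex fails. The following pattern does not work: (?<={%\s+?start\s+?%}).*?(?={%\s+?end\s+?%}) and PHP reports this error: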

preg_match_all(): Compilation failed: lookbehind assertion is not fixed length at offset 25

If I remove the lookahead/lookbehind, as in ({%\s+?(start|end)\s+%}), the same regex works.

Please advise.

1 answer:

Answer 0 (score: 3)

Description

Try this pattern:

[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]

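This matches all the text between the {% %} tags and automatically trims the text before placing the values into groups.

Group 0 gets the entire matched string

  1. gets the start tag text
  2. gets the inner text
  3. gets the end tag text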
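The group layout above can be sketched in PHP as follows. This is only an illustration: the input string and the `$blocks` array are hypothetical, and `PREG_SET_ORDER` is used so that each match carries its own groups 1 to 3.

```php
<?php
// Sketch: read the three capture groups per match.
// PREG_SET_ORDER makes $matches a list of matches instead of a list of groups.
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';

// Hypothetical input with a varying number of spaces inside the tags.
$input = '{%   start %} first block {% end   %} {% start %} second block {%end%}';

preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

$blocks = [];
foreach ($matches as $m) {
    // $m[0] is the whole match; $m[1], $m[2], $m[3] are the groups listed above.
    $blocks[] = ['open' => $m[1], 'body' => $m[2], 'close' => $m[3]];
}

print_r($blocks);
```

Because the whitespace is consumed outside the groups, the 'body' values should come back without leading or trailing spaces.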

Disclaimer

If you nest complex data inside the sub-text, there may be edge cases where this regex fails; if so, a regular expression is probably not the best tool for this task.
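One such edge case can be sketched with nested blocks. The input below is hypothetical and not taken from the original post; it only illustrates how the lazy inner group behaves when a block contains another block.

```php
<?php
// Sketch of one edge case: nested blocks (hypothetical input).
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';
$nested = '{% start %} outer {% start %} inner {% end %} tail {% end %}';

$count = preg_match_all($pattern, $nested, $matches);

// The lazy (.*?) stops at the first closing tag it can reach, so the outer
// block is cut short and the trailing " tail {% end %}" is left unmatched.
echo $count, "\n";
echo $matches[2][0], "\n";
```

Here the pattern should report a single match whose inner text is "outer {% start %} inner", which is neither the outer nor the inner block. Handling nesting properly would need a recursive pattern or a small parser.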

Summary

    [{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]
    Char class [{] matches one of the following chars: {
    % Literal `%`
    \s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    1st Capturing group ([^}]*start[^}]*) 
    Negated char class [^}] infinite to 0 times matches any char except: }
    start Literal `start`
    Negated char class [^}] infinite to 0 times matches any char except: }
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    \s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
    % Literal `%`
    Char class [}] matches one of the following chars: }
    \s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    2nd Capturing group (.*?) 
    . 0 to infinite times [lazy] Any character (except newline) 
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    \s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
    Char class [{] matches one of the following chars: {
    % Literal `%`
    \s infinite to 0 times Whitespace [\t \r\n\f] 
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    3rd Capturing group ([^}]*end[^}]*) 
    Negated char class [^}] infinite to 0 times matches any char except: }
    end Literal `end`
    Negated char class [^}] infinite to 0 times matches any char except: }
    \b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
    \s infinite to 0 times Whitespace [\t \r\n\f] 
    % Literal `%`
    Char class [}] matches one of the following chars: }
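The breakdown above can also be kept next to the pattern itself. The following is a sketch that assumes PCRE extended mode (the x modifier), where unescaped whitespace in the pattern is ignored and # starts a comment; the ~ delimiter and the comment wording are choices made for this illustration, and the pattern is intended to behave like the one-line version.

```php
<?php
// Sketch: the same pattern written in extended mode (x) with inline comments.
$pattern = '~
    [{]% \s*? \b             # opening delimiter of the start tag, optional whitespace
    ( [^}]* start [^}]* )    # group 1: start tag text
    \b \s*? %[}]             # closing delimiter of the start tag
    \s*? \b ( .*? ) \b \s*?  # group 2: inner text without surrounding whitespace
    [{]% \s* \b              # opening delimiter of the end tag
    ( [^}]* end [^}]* )      # group 3: end tag text
    \b \s* %[}]              # closing delimiter of the end tag
~ix';

$text = '{% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}';

preg_match_all($pattern, $text, $matches);
print_r($matches[2]);
```

Note that in extended mode whitespace inside a character class is still literal, which is why the spaces here are only placed between tokens.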
    

PHP example

With the sample text {% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}

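    <?php
    // Sample text from above; each {% start %} ... {% end %} pair is one match.
    $sourcestring = '{% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}';
    // Groups: 1 = start tag text, 2 = trimmed inner text, 3 = end tag text.
    preg_match_all('/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i', $sourcestring, $matches);
    echo "<pre>" . print_r($matches, true) . "</pre>";
    ?>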
    
    $matches Array:
    (
        [0] => Array
            (
                [0] => {% start %} this is a sample text 1 {% end %}
                [1] => {% start %} this is a sample text 2 {% end %}
            )
    
        [1] => Array
            (
                [0] => start
                [1] => start
            )
    
        [2] => Array
            (
                [0] => this is a sample text 1
                [1] => this is a sample text 2
            )
    
        [3] => Array
            (
                [0] => end
                [1] => end
            )
    
    )
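If the goal is closer to the original question, that is, a match that contains only the inner text, one possible alternative is PCRE's \K, which discards everything matched so far from the reported match. This is not the pattern from the answer above, just a sketch of a different technique: it sidesteps the fixed-length lookbehind restriction because the opening tag is matched normally and then dropped. The s modifier and the sample input are assumptions made for this illustration.

```php
<?php
// Sketch, not the answer's pattern: \K resets the start of the reported match,
// so the opening tag may contain any amount of whitespace.
// The lookahead for the closing tag may be variable-length; only lookbehind is restricted.
$pattern = '~\{%\s*start\s*%\}\s*\K.*?(?=\s*\{%\s*end\s*%\})~s';

$text = '{% start %} this is a sample text 1 {% end %}{%   start   %} this is a sample text 2 {%end%}';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds only the inner text of each block.
print_r($matches[0]);
```

With this input, $matches[0] should contain the two inner strings with surrounding whitespace removed, and no capture groups are needed.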