How can I keep preg_match from matching across results?

Asked: 2011-08-05 08:34:16

Tags: php regex parsing

I've run into a problem. I put things between [% and %]. I can grab [%some var%] without any trouble (it captures "some var").

However, when I have this:

[%something|[%with parameter that should be parsed before/after as well%]%]

it captures

"something|[%with parameter that should be parsed before/after as well"

How can I solve this? It seems to me I could check for a [% %]%] match first, but that offers no solution for something like:

[%something|[%with parameter that should be parsed before/after as well%] and something unparsed%]

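Maybe I could rewrite the regex so that it skips ahead while inside a [%%], except when it meets another [%%]. My knowledge of regexps is poor, but I decided to use them rather than a series of strpos calls...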


EDIT0

Well, I'd prefer a while loop that replaces each [%%] whose contents contain no other [% or %]...

I mean:

For example, I have defined some replacements:

post.date = 1. 1. 1970
post.time = 00:00:00
post.creator.name = John Smith
post.creator.age = (computed) 64

The replacement function already exists (it works fine as long as there is no nesting). This:

[%<replacement variable name>|(optional) prefix|(optional) suffix%]

results in:

prefix<replacement variable's value>suffix

That already works.

Example text:

"This post was created on [%post.date%][%post.time| at %][%post.creator.name| by [%post.creator.age|| years old %]user %]."

So the loop should take the example text through these steps:

Step 0: "This post was created on 1. 1. 1970[%post.time| at %][%post.creator.name| by [%post.creator.age|| years old %]user %]."
Step 1: "This post was created on 1. 1. 1970 at 00:00:00[%post.creator.name| by [%post.creator.age|| years old %]user %]."
Step 2: "This post was created on 1. 1. 1970 at 00:00:00[%post.creator.name| by 64 years old user %]."
Step 3: "This post was created on 1. 1. 1970 at 00:00:00 by 64 years old user John Smith."

I hope it makes sense now.


EDIT1

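Maybe I don't need regexps at all, since this is getting too complicated. Maybe I just need to write my own parser. After it hits a %], it would basically check whether there is a second, unclosed [% before it. Yes... that should do the trick, but please still try to help me. :) Thanks!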
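For what it's worth, the hand-written check described in EDIT1 can be sketched with plain string functions: find the first %], then look backwards for the last [% before it. The helper name `find_innermost_tag` and its return shape are made up for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original post): locate the first tag
// that contains no other tag. Returns [start, length, inner] or null.
function find_innermost_tag(string $text): ?array
{
    // The first "%]" in the text must close an innermost tag.
    $close = strpos($text, '%]');
    if ($close === false) {
        return null; // no closing delimiter at all
    }

    // The matching opener is the last "[%" that starts before that "%]".
    $open = strrpos(substr($text, 0, $close), '[%');
    if ($open === false) {
        return null; // stray "%]" without an opener
    }

    $inner = substr($text, $open + 2, $close - $open - 2);
    return [$open, $close + 2 - $open, $inner];
}
```

Calling this in a loop and splicing the replacement in with `substr_replace` would resolve tags from the inside out, without any regex.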


EDIT2

Finally got a solution!

It now does:

This is a post[%post.date| on %][%post.time| at %][%post.creator.name| by [%post.creator.age|| years old %]user %].
Step 0: This is a post on 1. 1. 1970[%post.time| at %][%post.creator.name| by [%post.creator.age|| years old %]user %].
Step 1: This is a post on 1. 1. 1970 at 00:00:00[%post.creator.name| by [%post.creator.age|| years old %]user %].
Step 2: This is a post on 1. 1. 1970 at 00:00:00[%post.creator.name| by 64 years old user %].
Step 3: This is a post on 1. 1. 1970 at 00:00:00 by 64 years old user John Smith.

Sam Graham's suggestion works 100% with a small edit. Thanks, Sam Graham!

3 answers:

Answer 0: (score: 1)

I would use (\[%((?:[^\[]|\[(?!%))*?)%\])

To break it down:

(        // Start capturing group 1 for the entire [%...%] block
\[%      // Match a literal [%
(        // Start capturing group 2 for the inner contents of the [%...%] block
(?:      // Start a non-capturing group of alternative matches
[^\[]    // Match anything that isn't a literal [
|        // or
\[(?!%)  // Match a literal [ that isn't followed by a %
)        // End list of alternative matches
*?       // Match as few as possible of the previous item
)        // End capture group 2
%\]      // Match a literal %]
)        // End capture group 1

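Or, in plain English: match [%, then anything that isn't another [%, up to the first %] you find, capturing both the part between [% and %] and the whole thing including the [% %] delimiters.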

You can test it with the following PHP script:

<?php
$tests = array(
    "[%something|[%with parameter that should be parsed before/after as well%]%]",
    "[%something%][%something else%]",
);

foreach ($tests as $test) {
    echo "Testing $test:\n";
    $loop = 0;
    // Keep going until no [%...%] block is left in the string.
    while (preg_match('/(\[%((?:[^\[]|\[(?!%))*?)%\])/', $test, $matches)) {
        $loop++;
        echo "  Loop $loop, looking at $test:\n";
        echo "    group 1: $matches[1]\n    group 2: $matches[2]\n";
        //  Do whatever here...
        // Note: str_replace replaces every occurrence of the matched block.
        $test = str_replace($matches[1], "REPLACED!", $test);
        echo "    replaced: $test\n";
    }
}
?>

This should give you the output:

Testing [%something|[%with parameter that should be parsed before/after as well%]%]:
  Loop 1, looking at [%something|[%with parameter that should be parsed before/after as well%]%]:
    group 1: [%with parameter that should be parsed before/after as well%]
    group 2: with parameter that should be parsed before/after as well
    replaced: [%something|REPLACED!%]
  Loop 2, looking at [%something|REPLACED!%]:
    group 1: [%something|REPLACED!%]
    group 2: something|REPLACED!
    replaced: REPLACED!
Testing [%something%][%something else%]:
  Loop 1, looking at [%something%][%something else%]:
    group 1: [%something%]
    group 2: something
    replaced: REPLACED![%something else%]
  Loop 2, looking at REPLACED![%something else%]:
    group 1: [%something else%]
    group 2: something else
    replaced: REPLACED!REPLACED!
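
Applied to the [%name|prefix|suffix%] format from the question, the same loop might look roughly like the sketch below. The asker's real replacement function is not shown, so `expand_tag`, `render`, and the `$vars` array are assumptions, as is the choice to expand unknown names to an empty string. It uses `PREG_OFFSET_CAPTURE` with `substr_replace` so that only the matched occurrence is replaced, and it needs PHP 7.1 or later for the array destructuring.

```php
<?php
// Assumed helper: turn "name|prefix|suffix" into prefix . value . suffix.
function expand_tag(string $inner, array $vars): string
{
    $parts  = explode('|', $inner, 3);
    $name   = $parts[0];
    $prefix = $parts[1] ?? '';   // prefix is optional
    $suffix = $parts[2] ?? '';   // suffix is optional
    $value  = $vars[$name] ?? ''; // assumption: unknown names expand to ''
    return $prefix . $value . $suffix;
}

// Assumed helper: resolve innermost tags first until none are left.
function render(string $text, array $vars): string
{
    // Same idea as the pattern above, without the outer capture group.
    $pattern = '/\[%((?:[^\[]|\[(?!%))*?)%\]/';
    while (preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        [$tag, $offset] = $m[0]; // whole tag and its byte offset
        $replacement = expand_tag($m[1][0], $vars);
        $text = substr_replace($text, $replacement, $offset, strlen($tag));
    }
    return $text;
}

$vars = [
    'post.date'         => '1. 1. 1970',
    'post.time'         => '00:00:00',
    'post.creator.name' => 'John Smith',
    'post.creator.age'  => '64',
];
$template = 'This is a post[%post.date| on %][%post.time| at %]'
          . '[%post.creator.name| by [%post.creator.age|| years old %]user %].';
echo render($template, $vars), "\n";
```

One caveat: if a variable's value itself contains [%...%], it gets parsed again on the next iteration, and a value that keeps producing tags would never terminate.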

Answer 1: (score: 1)

If you're prepared to run one regex pass per nesting level (as your edit suggests), then it's easy:

$result = preg_replace_callback(
    '/\[%     # Match [%
    (         # Match and capture...
     (?:      # the following:
      (?!     # If the next part of the string is neither...
       \[%    #  [%
      |       # nor
       %\]    #  %]
      )       # (End of lookahead)
      .       # then match any character.
     )*       # Do this any number of times.
    )         # End of capturing group.
    %\]       # Match %]
    /x', 
    'compute_replacement', $subject);

function compute_replacement($groups) {
    // $groups[1] holds the text between [%...%]
    return 'myreplacement';
}

Do this once per nesting level.
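
One way to run it once per nesting level without knowing the depth in advance is to repeat the call until it reports zero replacements, using the `$count` argument of `preg_replace_callback`. This is only a sketch: the wrapper name `replace_all_levels` is made up, the pattern is a compact form of the one above with the `s` modifier added (my assumption, so that tags may span lines), and the upper-casing callback is a stand-in for real replacement logic.

```php
<?php
// Placeholder callback (an assumption): upper-case the tag contents.
function compute_replacement(array $groups): string
{
    // $groups[1] holds the text between [% and %]
    return strtoupper($groups[1]);
}

// Hypothetical wrapper: one pass per nesting level until nothing matches.
function replace_all_levels(string $subject): string
{
    $pattern = '/\[%((?:(?!\[%|%\]).)*)%\]/s';
    do {
        $result = preg_replace_callback(
            $pattern, 'compute_replacement', $subject, -1, $count
        );
        if ($result === null) {
            break; // regex error: keep what we have so far
        }
        $subject = $result;
    } while ($count > 0);
    return $subject;
}

echo replace_all_levels('[%a|[%b%]c%] and [%d%]'), "\n";
```

Each pass resolves every tag that currently has no tag inside it, so the loop runs one extra, empty pass after the deepest level has been handled. It would not terminate if the callback kept returning text that contains new tags.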

Answer 2: (score: 0)

To match nested tags you can use:

\[%((?:[^[%]++|\[(?!%)|%(?!])|(?R))*)%]
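
As a rough illustration of how that recursive pattern could be used: because (?R) lets the regex match a whole outer tag including its nested tags, the callback has to expand the inner text itself before building the replacement. The function name `expand_nested` and the angle-bracket output are placeholders of mine, not part of the answer.

```php
<?php
// Hypothetical helper: expand tags depth-first. Each tag's content is
// wrapped in angle brackets as a stand-in for a real replacement.
function expand_nested(string $text): string
{
    $pattern = '/\[%((?:[^[%]++|\[(?!%)|%(?!])|(?R))*)%]/';
    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] may still contain nested tags, so resolve those first.
            $inner = expand_nested($m[1]);
            return '<' . $inner . '>';
        },
        $text
    );
    return $result ?? $text; // on a regex error, return the input unchanged
}

echo expand_nested('[%a|[%b%]c%] x [%d%]'), "\n";
```

Compared with the loop-based approaches, this makes a single pass over the top-level text and handles nesting through ordinary function recursion.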