How do I match every line wrapped by a start/end marker?

Time: 2013-02-24 07:17:15

Tags: regex

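I want to convert my blog from Markdown to HTML. I use [crayon lang="cpp"]...[/crayon] to paste code. I want to grab every line wrapped in [crayon][/crayon] and add 4 spaces at the start of each of those lines. For example: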

Some text

[crayon lang="bash"]
#!/bin/bash
[/crayon]

other text

[crayon lang="cpp"]
int main()
{
}
[/crayon]

I want it to become:

Some text

    #!/bin/bash

other text

    int main()
    {
    }

I don't know how to do this with a regex. Can anyone help me?

Here is what I have tried:

  • \[crayon.*?\]([\d\D]*?)\[\/crayon\] with \1 matches all the lines wrapped in [crayon][/crayon], but I can't add the spaces.
  • (?'st'\[crayon.*?\])^.*$(?'-st'\[/crayon\]) doesn't match anything.

3 Answers:

Answer 0: (score: 1)

A (relatively) simple way is to do it in two steps:

1

Insert 4 spaces at the start of each line, but only for lines that come after a '[crayon lang="..."]' tag and before its '[/crayon]' tag:

pattern     : (?ms)^(?=(?:(?!\[crayon\b).)*\[/crayon])
replacement : '    ' (4 spaces)

2

Remove all '[crayon lang="..."]' and '[/crayon]' tags:

pattern     : \[/?crayon.*?][ \t]*(\r?\n|$)
replacement : '' (empty string)

PHP demo:

<?php

$text = 'Some text

[crayon lang="bash"]
#!/bin/bash
[/crayon]

other text

[crayon lang="cpp"]
int main()
{
}
[/crayon]';

$text = preg_replace('#^(?=(?:(?!\[crayon\b).)*\[/crayon])#ms', '    ', $text);

$text = preg_replace('#\[/?crayon.*?][ \t]*(\r?\n|$)#', '', $text);

echo "$text\n";

?>

which prints:

Some text

    #!/bin/bash

other text

    int main()
    {
    }
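
The same two steps should carry over to other regex engines that support lookaheads. Below is a minimal sketch in Python's `re` module, assuming a recent Python 3; the names `TEXT` and `crayon_to_indented` are made up for illustration and are not part of the answer above:

```python
import re

# Sample input taken from the question.
TEXT = '''Some text

[crayon lang="bash"]
#!/bin/bash
[/crayon]

other text

[crayon lang="cpp"]
int main()
{
}
[/crayon]'''


def crayon_to_indented(text):
    # Step 1: insert 4 spaces at every line start from which '[/crayon]'
    # can be reached without crossing an opening '[crayon' tag.
    text = re.sub(r'(?ms)^(?=(?:(?!\[crayon\b).)*\[/crayon])', '    ', text)
    # Step 2: drop the opening and closing tags together with their line break.
    return re.sub(r'\[/?crayon.*?][ \t]*(\r?\n|$)', '', text)


print(crayon_to_indented(TEXT))
```

One detail worth knowing: step 1 also matches the start of the `[/crayon]` line itself, so after step 2 the line that held the closing tag is left with four spaces of trailing whitespace. That is invisible when printed, but a final pass that strips trailing whitespace may be wanted.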

A quick explanation of the perhaps terse regex ^(?=(?:(?!\[crayon\b).)*\[/crayon]):

^                    # match the start of a line
(?=                  # start positive look ahead
  (?:                #   start group
    (?!\[crayon\b).  #     match any char as long as it doesn't have `[crayon` in front of it
  )*                 #   end group and repeat it zero or more times
  \[/crayon]         #   match '[/crayon]'
)                    # end positive look ahead

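In plain English, it reads:

Match the start of any line that has a [/crayon] somewhere ahead of it, provided there is no [crayon between the start of that line and that [/crayon].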
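
To see which line starts the lookahead actually selects, the pattern can be run on its own. This is a small sketch in Python; `matching_lines` and `SAMPLE` are hypothetical names used only for this illustration:

```python
import re

# The step-1 pattern: a zero-width match at qualifying line starts.
LINE_START = re.compile(r'(?ms)^(?=(?:(?!\[crayon\b).)*\[/crayon])')

SAMPLE = 'intro\n[crayon lang="cpp"]\nint main()\n{\n}\n[/crayon]\noutro'


def matching_lines(text):
    """Return the lines whose start position satisfies the lookahead."""
    starts = {m.start() for m in LINE_START.finditer(text)}
    selected = []
    pos = 0
    for line in text.split('\n'):
        if pos in starts:
            selected.append(line)
        pos += len(line) + 1  # +1 for the '\n' that split() removed
    return selected


print(matching_lines(SAMPLE))
```

For this sample the selected lines are `int main()`, `{`, `}` and the closing `[/crayon]` line. The opening tag line and the text outside the block are skipped, because from their start no `[/crayon]` can be reached without first hitting `[crayon`.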

Answer 1: (score: 0)

I have an idea. Use it if you think it works for you.

1. Scan line by line:
    a. Look for the pattern \[crayon.+\].
    b. If the line does not match, write it out unchanged.
    c. If the line matches, write nothing and start looking for the pattern \[\/crayon\].
    d. Until that pattern is found, write every line with 4 spaces added at its beginning.
    e. When the pattern from (c) is found, write nothing and start again from (a).
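
The steps above can be sketched as a small state machine. This is an illustrative Python version, assuming each tag sits on a line of its own as in the question's sample; the function name `indent_crayon_blocks` is hypothetical:

```python
import re

OPEN_TAG = re.compile(r'\[crayon.+\]')   # pattern from step (a)
CLOSE_TAG = re.compile(r'\[/crayon\]')   # pattern from step (c)


def indent_crayon_blocks(text, indent='    '):
    out = []
    inside = False  # True while between an opening and a closing tag
    for line in text.split('\n'):
        if not inside:
            if OPEN_TAG.search(line):
                inside = True          # (c) drop the opening tag line
            else:
                out.append(line)       # (b) copy the line unchanged
        elif CLOSE_TAG.search(line):
            inside = False             # (e) drop the closing tag line
        else:
            out.append(indent + line)  # (d) indent lines inside the block
    return '\n'.join(out)


sample = 'Some text\n\n[crayon lang="bash"]\n#!/bin/bash\n[/crayon]\n\nother text'
print(indent_crayon_blocks(sample))
```

Compared with a single regex, this version is longer but easier to extend, for example to leave blank lines inside a block unindented.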

Answer 2: (score: -1)

How about \[crayon.*?\]\n(.*\n)*?\[\/crayon\]\n? This way \1 can capture every line.
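
Whether \1 holds every line depends on the engine. In Python's `re` module, a repeated capturing group keeps only its last repetition, whereas .NET exposes every repetition through the group's Captures collection. The sketch below first checks that behaviour in Python, then shows one possible workaround: capture the whole block body in a single group and indent it in a replacement callback. The names `last_repetition` and `indent_blocks` are invented for this example, and the trailing `\n?` is an adjustment so that a closing tag at the very end of the text still matches:

```python
import re

# The answer's pattern: group 1 is repeated once per line of the block.
REPEATED = re.compile(r'\[crayon.*?\]\n(.*\n)*?\[/crayon\]')

# Variant: one group around the whole body, with the repetition inside it.
BLOCK = re.compile(r'\[crayon.*?\]\n((?:.*\n)*?)\[/crayon\]\n?')


def last_repetition(text):
    """Return what group 1 holds after a match of the answer's pattern."""
    return REPEATED.search(text).group(1)


def indent_blocks(text, indent='    '):
    def repl(match):
        body = match.group(1)
        # Prefix every body line, keeping the original line endings.
        return ''.join(indent + line for line in body.splitlines(keepends=True))
    return BLOCK.sub(repl, text)


cpp = '[crayon lang="cpp"]\nint main()\n{\n}\n[/crayon]'
print(repr(last_repetition(cpp)))
print(indent_blocks(cpp))
```

Here `last_repetition(cpp)` returns only the final body line, while `indent_blocks` indents all three.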