Question

I have a text string like this:

def ctext = """This is the normal text.
This is the again normal text.
<code>int main(){
printf('Hello World!\n');
return 0;}
</code>

This is the again normal text.
This is the again normal text.

<code>
public static void main (String args[]){
System.out.println('Hello World!\n');
return 0;}
</code>

The last line ....
"""

I want to replace every occurrence of the text between <code> and </code> by passing the '(.*)' part to a method like doBeautify(codeText).

I tried this, but no luck:

def matches = ctext =~ /<code>(.*)<\/code>/

Any help is appreciated. Thanks.

Answer 1

By default, . does not match \r or \n. Try:

def matches = ctext =~ /(?s)<code>(.*?)<\/code>/

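where (?s) is the DOT-ALL modifier (it makes . match any character, including line breaks). I also put a ? after .* to make it non-greedy. Otherwise it would match from the first <code> to the last </code> (and everything in between).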
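To make the effect of the two changes visible, here is a minimal sketch that runs three variants of the pattern against a short string. The `sample` string and the variable names are made up for illustration; they are not from the question.

```groovy
// Hypothetical sample input: two <code> blocks, the first spanning a line break
def sample = "a <code>x\ny</code> b <code>z</code> c"

// Without (?s), '.' stops at the line break, so the first block is never matched
def noDotAll = (sample =~ /<code>(.*?)<\/code>/).collect { it[1] }

// (?s) with greedy .* runs from the first <code> to the last </code>
def greedy = (sample =~ /(?s)<code>(.*)<\/code>/).collect { it[1] }

// (?s) with non-greedy .*? stops at the nearest </code>
def nonGreedy = (sample =~ /(?s)<code>(.*?)<\/code>/).collect { it[1] }

println noDotAll.size()   // 1 (only "z")
println greedy.size()     // 1 (one oversized match)
println nonGreedy.size()  // 2 (one match per block)
```

Only the last variant yields one capture per block, which is the behaviour the question asks for.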

Be aware that your regex will break if your input looks like this:

<code>int main(){
printf('Hello </code> World!\n');
</code>

to name just one of the many corner cases. In such cases, you need a proper parser for your language.
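A short sketch of how that corner case plays out, assuming the same (?s) non-greedy pattern is used. The variable names `tricky` and `captured` are illustrative only.

```groovy
// The corner-case input: a literal </code> appears inside the printf string
def tricky = """<code>int main(){
printf('Hello </code> World!\\n');
</code>"""

def captured = (tricky =~ /(?s)<code>(.*?)<\/code>/).collect { it[1] }

// The regex cannot tell the two closing tags apart, so the capture
// ends at the </code> inside the string literal and the rest is lost
println captured.size()
println captured[0]
```

The single capture stops right after `Hello `, which is why a regex alone is not enough once the embedded code may itself contain the closing tag.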

Edit

A small demo:

def ctext = """This is the normal text.
This is the again normal text.
<code>int main(){
printf('Hello World!\\n');
return 0;}
</code>

This is the again normal text.
This is the again normal text.

<code>
public static void main (String args[]){
System.out.println('Hello World!\\n');
return 0;}
</code>

The last line ....
"""

def matches = ctext =~ /(?s)<code>(.*?)<\/code>/
matches.each { println it[1] }

produces

int main(){
printf('Hello World!\n');
return 0;}


public static void main (String args[]){
System.out.println('Hello World!\n');
return 0;}

This can be tested at http://ideone.com/JQ0Ck
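The demo only prints the captured groups, while the question asks for them to be replaced. One possible way to do that is Groovy's replaceAll with a closure, sketched below. The body of doBeautify is a hypothetical stand-in (it just trims and indents each line), since the question does not show the real method, and the `text` input is made up.

```groovy
// Hypothetical stand-in for the asker's doBeautify: trims the block and indents each line
def doBeautify(String codeText) {
    codeText.trim().readLines().collect { '    ' + it }.join('\n')
}

// Made-up input with a single <code> block
def text = "before\n<code>\nint x = 1;\nreturn x;\n</code>\nafter"

// The closure receives the whole match and then each capture group;
// its return value replaces the whole match
def result = text.replaceAll(/(?s)<code>(.*?)<\/code>/) { all, code ->
    '<code>\n' + doBeautify(code) + '\n</code>'
}

println result
```

Whether the surrounding tags are kept, as here, or dropped is a design choice; returning only doBeautify(code) from the closure would remove them.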

Answer 2

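~~Did you try the (?m) modifier for multi-line regexes?~~

Bart Kiers mentioned in the comments that this does not help; you have to use the dot-all modifier (?s). Thanks for pointing that out, Bart.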
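A quick check of the difference between the two modifiers, as a sketch on a made-up two-line string:

```groovy
// Hypothetical input: one <code> block that spans a line break
def snippet = "<code>line one\nline two</code>"

// (?m) only lets ^ and $ match at line boundaries; '.' still stops at \n
def withMultiline = (snippet =~ /(?m)<code>(.*?)<\/code>/).find()

// (?s) lets '.' match line terminators, so the block is found
def withDotAll = (snippet =~ /(?s)<code>(.*?)<\/code>/).find()

println withMultiline  // false
println withDotAll     // true
```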
