Capturing text with parentheses (Perl RegEx)

Time: 2014-01-28 15:42:39

Tags: regex perl

I'm back with a follow-up to this question. Suppose I have the text

====Example 1====
Some text that I want to get that
may include line breaks
or special ~!@#$%^&*() characters

====Example 2====
Some more text that I don't want to get.

and I use $output = ($text =~ ====Example 1====\s*(.*?)\s*====); to try to capture everything from "====Example 1====" up to the four equals signs before "Example 2".

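From what I've seen on this site and on regexpal.com, and from running it myself, Perl finds and matches the text, but $output either stays null or is assigned "1". I'm pretty sure I'm doing something wrong with the capturing parentheses, but I can't figure out what. Any help would be appreciated. My full code is: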

$text = "====Example 1====\n
Some text that I want to get this text\n
may include line breaks\n
or special ~!@#$%^&*() characters\n
\n
====Example 2====]\n
Some more filler text that I don't want to get.";
my ($output) = $text =~ /====Example 1====\s*(.*?)\s*====/;
die "un-defined" unless defined $output;
print $output;

2 answers:

Answer 0 (score: 3)

Try forcing list context with parentheses around the variable, and use /s on the match so that . also matches newlines:

my ($output) = $text =~ / /s;
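
To make the two points above concrete, here is a minimal sketch. The sample text is modeled on the question's input, and the variable names ($flag, $captured, $no_s) are hypothetical, chosen only for illustration. It compares a match in scalar context, a match in list context, and a list-context match without /s.

```perl
use strict;
use warnings;

# Sample input modeled on the question's text. The single-quoted heredoc
# keeps every character literal (nothing is interpolated).
my $text = <<'END';
====Example 1====
Some text that I want to get that
may include line breaks

====Example 2====
Some more text that I don't want to get.
END

# Scalar context: the match returns a true/false value, not the capture.
my $flag = $text =~ /====Example 1====\s*(.*?)\s*====/s;

# List context: parentheses around the variable make the match return its captures.
my ($captured) = $text =~ /====Example 1====\s*(.*?)\s*====/s;

# Without /s, "." cannot cross a newline, so the pattern fails on this
# multi-line input and the variable stays undefined.
my ($no_s) = $text =~ /====Example 1====\s*(.*?)\s*====/;

print "scalar context: $flag\n";
print "list context:\n$captured\n";
print "without /s: ", (defined $no_s ? $no_s : 'undef'), "\n";
```

In scalar context a successful match evaluates to 1, which would explain the "1" reported in the question. Wrapping the right-hand side in parentheses, as in $output = ($text =~ ...), does not change that, because the assignment to a plain scalar is still a scalar assignment.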

Answer 1 (score: 1)

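Two things.

  1. Apply the /s flag to the regex so that . also matches newlines; the input here spans multiple lines.
  2. Put the parentheses around $output, i.e. ($output) = $text =~ regex, rather than around ($text =~ regex).

Example: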

    ($output) = $text =~ /====Example\s1====\s*(.*?)\s*====/s;

    For example, put it into a script like this:

    #!/usr/bin/env perl
    
    $text="
    ====Example 1====
    Some text that I want to get that
    may include line breaks
    or special ~!@#$%^&*() characters
    
    ====Example 2====
    Some more text that I don't want to get.
    ";
    
    print "full text:","\n";
    &hr;
    print "$text","\n";
    &hr;
    
    ($output) = $text =~ /====Example\s1====\s*(.*?)\s*====/s;
    print "desired output of regex:","\n";
    &hr;
    print "$output","\n";
    &hr;
    
    sub hr {
            print "-" x 80, "\n";
    }
    

    The output looks like this:

    bash$ perl test.pl
    --------------------------------------------------------------------------------
    full text:
    --------------------------------------------------------------------------------
    
    ====Example 1====
    Some text that I want to get that
    may include line breaks
    or special ~!@#0^&*() characters
    
    ====Example 2====
    Some more text that I don't want to get.
    
    --------------------------------------------------------------------------------
    desired output of regex:
    --------------------------------------------------------------------------------
    Some text that I want to get that
    may include line breaks
    or special ~!@#0^&*() characters
    --------------------------------------------------------------------------------
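
One detail in the output above: the script's source contains ~!@#$%^&*(), but the printed text shows ~!@#0^&*(). That is consistent with Perl interpolating $% (the special variable holding the current output page number, 0 by default) inside the double-quoted string. A small sketch of the difference, with hypothetical variable names:

```perl
use strict;
use warnings;

# Double-quoted: "$%" is interpolated as Perl's special page-number variable.
my $interpolated = "special ~!@#$%^&*() characters";

# Single-quoted: no interpolation, so the characters are kept literally.
my $literal = 'special ~!@#$%^&*() characters';

print "$interpolated\n";
print "$literal\n";
```

If the special characters need to survive unchanged, single quotes or a single-quoted heredoc (<<'END') are one way to keep them literal; the regex itself is unaffected either way.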