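Need to match a multi-line string using Perl

Time: 2014-09-09 00:32:42

Tags: regex perl

I know this question has been asked many times on this forum, and I have checked the answers to most of them, but I still can't get it to work for my input file:

Input file: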

flow items
{
Flow Item1
{
    Result -2
    {
        line1;
        line2;
        line3;
    }       

    Result 0
    {
        line4;
        START_TEXT blah blah blah;
        blah blah blah_END_TEXT;
        line4;
    }       
    Result 1
    {
        line5;
        START_TEXT foo foo foo;
        line5;
    }       
    Result 2
    {
        line6;
        START_TEXT blah1 blah1 blah1;
        blah1 blah1 blah1_END_TEXT;
        line7;
    }       
}   

Flow item2
{
    Result -2
    {
        line8;
        line9;
        line10;
    }       

    Result 0
    {
        line11;
        START_TEXT blah2 blah2 blah2;
        blah2 blah2 blah2_END_TEXT;
        line12;
    }       
    Result 1
    {
        line13;
        START_TEXT foo1 foo1 foo1;
        line14;
    }       
}
}

Expected output:

START_TEXT blah blah blah;
blah blah blah_END_TEXT;
START_TEXT blah1 blah1 blah1;
blah1 blah1 blah1_END_TEXT;
START_TEXT blah2 blah2 blah2;
blah2 blah2 blah2_END_TEXT;

So basically there are several occurrences of START_TEXT that I do not want (the ones that have no END_TEXT in the same Result block), for example:

    Result 1
    {
        line5;
        START_TEXT foo foo foo;
        line5;
    }       

The code I have been trying (it looks at all .txt files in the working directory and searches for the multi-line pattern):

#!/usr/bin/perl
use strict;
use warnings;
use Cwd;
use File::Find;

my $file_pattern  ="txt";

find(\&d, cwd);

sub d {

  my $file = $File::Find::name;

  $file =~ s,/,\\,g;

  return unless -f $file;
  return unless $file =~ /$file_pattern/;

  open F, $file or print "couldn't open $file\n" && return;
  open $fout, ">>", "Result.txt";
  while (<F>) {

    $string =~ /(START_TEXT.*?)END_TEXT/s;
    print $fout "$string\n";;

  }

  close F;
  close $fout;
}

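This only returns a few newlines. Please help.

1 answer:

Answer 0 (score: 2)

You can set the input record separator ($/) to "}", so that each read returns one chunk ending at a closing brace instead of one line: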

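use strict;
use warnings;
use autodie;    # open dies with a message on failure

open my $fh, '<', 'input.file';

# Read up to and including the next "}" instead of the next newline.
local $/ = "}";

while(<$fh>) {
    # /s lets "." match newlines, so the pattern can span lines in a record.
    next unless my ( $cap ) = /(START_TEXT.*END_TEXT;)/s;
    $cap =~ s/\n\s*/\n/g;    # strip the indentation after each newline
    print $cap, "\n";
}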

Output:

START_TEXT blah blah blah;
blah blah blah_END_TEXT;
START_TEXT blah1 blah1 blah1;
blah1 blah1 blah1_END_TEXT;
START_TEXT blah2 blah2 blah2;
blah2 blah2 blah2_END_TEXT;
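
The loop in the question reads one line at a time and applies the regex to $string, which is never assigned, so no single read can contain both START_TEXT and END_TEXT. As a minimal sketch of the record-separator idea in reusable form, the block below wraps it in a function. The name extract_blocks and the inline sample are assumptions made for illustration and are not part of the original answer; the string is read through an in-memory filehandle so the example needs no input file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns every
# START_TEXT ... END_TEXT; span that lies inside one "}"-terminated record.
sub extract_blocks {
    my ($text) = @_;
    my @found;

    # Read from the string through an in-memory filehandle.
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

    # Each read now returns everything up to and including the next "}".
    local $/ = "}";

    while (my $record = <$fh>) {
        # /s lets "." cross newlines; records without END_TEXT; are skipped.
        next unless $record =~ /(START_TEXT.*?END_TEXT;)/s;
        my $cap = $1;
        $cap =~ s/\n\s*/\n/g;    # strip the indentation after each newline
        push @found, $cap;
    }
    close $fh;
    return @found;
}

# Assumed sample: two Result blocks taken from the question's input.
my $sample = <<'END_SAMPLE';
Result 0
{
    line4;
    START_TEXT blah blah blah;
    blah blah blah_END_TEXT;
    line4;
}
Result 1
{
    line5;
    START_TEXT foo foo foo;
    line5;
}
END_SAMPLE

print "$_\n" for extract_blocks($sample);
```

For this sample only the Result 0 pair is printed, because the Result 1 record has no END_TEXT. To cover every .txt file in the working directory, as the question intends, the same function could be called from the File::Find callback with each file's contents.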