I'm having some trouble sorting and extracting multi-line text. Here is my code:
my $searched = $doc->content;
if ($searched =~ /MODIFIED files in Task $_[1] : (.*?) The/gs) {
    print $1, "\n";
    $Modified = $1;
}
if ($searched =~ m/COMPILED in Task $_[1] : (.*?) The/ms) {
    $Compiled = $1;
}
if ($searched =~ m/DELETED in Task $_[1] : (.*?) Comments/ms) {
    $Deleted = $1;
}
Here is a sample of the text file:
The following are the MODIFIED files in Task 50104 : **Directory Filename Version --------- -------- ------- Something Something ..... ...... ...... ..... ....... ........ .....** The following are the files to be COMPILED in Task 50104 : **Directory Filename --------- -------- ......... .........** The following are the files to be DELETED in Task 50104 : **Directory Filename --------- --------** Comments: Blah blah.......
The text between the ** markers is what I want to extract. Sorry for the poor formatting.
Answer 0: (score: 1)
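I'm not sure your text really has a space after ":" and before "The" / "Comments" (in fact, it looks to me as though ":" is followed by a newline, and "The" is preceded by a newline rather than a space). So instead of using: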
if($searched =~ /MODIFIED files in Task $_[1] : (.*?) The/gs){
try using:
if($searched =~ /MODIFIED files in Task $_[1] :(.*?)The/gs){
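I also don't think you need the /g or /m switches: /g is for repeated matching, and /m only changes what ^ and $ match, and these patterns use neither.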
If that doesn't work, I suggest building the regex up step by step: first make sure that /MODIFIED files in Task $_[1] :/ matches on its own, then add the rest.
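To make the whitespace point concrete, here is a minimal sketch that swaps the literal spaces for \s*, so the pattern matches whether the block is delimited by spaces or by newlines. The sample layout, the data rows ("src main.c 1.4") and the $task variable (standing in for $_[1]) are assumptions for illustration, not taken from the real file.

```perl
use strict;
use warnings;

# Assumed layout: every header ends in " :" followed by a newline.
# The data rows are made up for this example.
my $searched = <<'END';
The following are the MODIFIED files in Task 50104 :
Directory Filename Version
--------- -------- -------
src main.c 1.4

The following are the files to be COMPILED in Task 50104 :
Directory Filename
--------- --------
src main.c

The following are the files to be DELETED in Task 50104 :
Directory Filename
--------- --------

Comments: none
END

my $task = 50104;    # hypothetical stand-in for $_[1] in the question

# \s* accepts a space, a newline, or nothing around the captured block.
# /s lets . match newlines, so (.*?) can span several lines.
my ($modified) = $searched =~ /MODIFIED files in Task \Q$task\E\s*:\s*(.*?)\s*The following/s;
my ($compiled) = $searched =~ /COMPILED in Task \Q$task\E\s*:\s*(.*?)\s*The following/s;
my ($deleted)  = $searched =~ /DELETED in Task \Q$task\E\s*:\s*(.*?)\s*Comments/s;

print "MODIFIED:\n$modified\n\n";
print "COMPILED:\n$compiled\n\n";
print "DELETED:\n$deleted\n";
```

Matching in list context returns the capture directly, which avoids reading $1 after a match that might have failed.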
Answer 1: (score: 1)
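The flip-flop operator (..) has a left-hand side and a right-hand side. Once the left-hand side evaluates to true, the flip-flop stays true until the right-hand side evaluates to true.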
use strict;
use warnings;
my $searched = $doc->content;
my %info;    # section type => extracted text
open my $string, '<', \$searched or die $!;
{
    my ( $type, $content );
    while ( <$string> ) {    # Process $searched line by line
        if ( /(MODIFIED|COMPILED|DELETED)/ ) {
            $type = $1;
        }
        if ( /^Directory/ .. /^\s*$/ ) {
            $content .= $_;
            # Keep collecting until the blank line (or end of input) closes the block
            next unless /^\s*$/ || eof $string;
        }
        next unless defined $type && defined $content;
        $content =~ s{\s+$}{};      # Don't need that trailing whitespace
        $info{$type} = $content;    # Or push @{ $info{$type} }, $content;
        undef $type;
        undef $content;
    }
}
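If the flip-flop is unfamiliar, this small standalone sketch shows which lines it selects. The @lines array is made-up input, and @inside is just a name chosen for the example.

```perl
use strict;
use warnings;

# Made-up input: one header, a two-line block, a blank line, a trailer.
my @lines = (
    "header\n",
    "Directory Filename\n",
    "src main.c\n",
    "\n",
    "trailer\n",
);

my @inside;
for (@lines) {
    # True from the first line matching /^Directory/ through the
    # next blank line, inclusive; false again afterwards.
    push @inside, $_ if /^Directory/ .. /^\s*$/;
}

print @inside;
```

Note that the line that makes the right-hand side true is still selected, which is why the blank line ends up in @inside and why the trailing whitespace has to be stripped later.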
Answer 2: (score: 0)
Here is a quick hack (untested). Instead of reading the whole file into a string, process it line by line:
$ script.pl inputfile.txt
use strict;
use warnings;

my %data;
my $header;
while (<>) {
    next if /^\s*$/;                    # skip empty lines
    last if /^Comments:/;               # the sections end where the comments begin
    if (/^The following are /) {        # header line
        if (/(MODIFIED|COMPILED|DELETED)/) {
            $header = $1;
        } else { die "Bad header: $_" }
    } else {                            # data line
        die "Header expected" unless defined $header;
        $data{$header} .= $_;
    }
}
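Once the loop has filled %data, each value is one multi-line string. As a rough sketch of one way to use it, the snippet below splits every section into rows and drops the column titles and the dashed rule. The hash contents shown are hypothetical, and %rows is a name introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical contents of %data after the loop above has run
my %data = (
    MODIFIED => "Directory Filename Version\n--------- -------- -------\nsrc main.c 1.4\n",
    COMPILED => "Directory Filename\n--------- --------\nsrc main.c\n",
    DELETED  => "Directory Filename\n--------- --------\n",
);

my %rows;
for my $header (sort keys %data) {
    my @lines = split /\n/, $data{$header};
    # Drop the column-title line and the dashed rule beneath it
    $rows{$header} = [ grep { !/^Directory\b/ && !/^-+(?:\s+-+)*$/ } @lines ];
    printf "%-8s %d row(s)\n", $header, scalar @{ $rows{$header} };
}
```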