Say I have a file like the following:
myCategory1
skip some lines from reading
things that I want in the hash for myCategory2
things that I want in the hash for myCategory2
...
myCategory2
skip some lines from reading
things that I want in the hash for myCategory3
things that I want in the hash for myCategory3
...
myCategory3
skip some lines from reading
things that I want in the hash for myCategory1
things that I want in the hash for myCategory1
...
Now I read the file through a filehandle. The code, in pseudo-form:
while(my $line=<FHIN>){
    chomp($line);
    if($line=~ /^myCategory1$/){
        $line=<FHIN>; # get rid of unwanted line
        chomp($line);
        $line=<FHIN>; # get rid of unwanted line
        chomp($line);
        $line=<FHIN>; # this is the line of interest
        chomp($line);
        do{
            @sub_str = split(' ',$line);
            $temp_key=$sub_str[2].$sub_str[5].$sub_str[6]; #dummy assignment
            $hash{$temp_key}=$sub_str[1]; #dummy assignment
            $line=<FHIN>; # this line contains r.*
            chomp($line);
        }while((defined($line))&&($line !~ /^myCategory2*$/));
    }
    if($line=~ /^myCategory2$/){
        $line=<FHIN>; # get rid of unwanted line
        chomp($line);
        $line=<FHIN>; # get rid of unwanted line
        chomp($line);
        $line=<FHIN>; # this is the line of interest
        chomp($line);
        do{
            @sub_str = split(' ',$line);
            $temp_key=$sub_str[2].$sub_str[5].$sub_str[6]; #dummy assignment
            $hash{$temp_key}=$sub_str[1]; #dummy assignment
            $line=<FHIN>; # this line contains r.*
            chomp($line);
        }while((defined($line))&&($line !~ /^myCategory3*$/));
    }
}
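The idea is to look for myCategory1 ... myCategory3 and capture the lines in between for further processing. Now, if I have 10 files to process, each with 50 categories, my approach needs one while loop per file, each containing 50 if..else blocks, all completely hard-coded. How can I generalize this and cut down the number of lines, given that the blocks are exactly the same? If my question is unclear, please feel free to ask! Thanks.
Edit: Thank you for the answers. Let me restate what I need.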
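Answer 0 (score: 2)
I think the structure you need is a single loop that iterates over the file and uses a state variable to decide how to treat each line.
I have updated the code for the revised requirements: a pattern recognizes the start of a category, and a filter (%desired_categories) lets you ignore categories you are not interested in.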
use strict;
use warnings;

my %hash;
my %desired_categories = map { ($_ => 1) } qw(myCategory1 myCategory3);
process_file('data2.txt');

sub process_file {
    my $filename = shift;
    my $current_category;    # state variable: the category being read
    open(my $fhin, '<', $filename) or die "Can't open $filename: $!";
    while (my $line = <$fhin>) {
        chomp($line);
        # match pattern of the start of a new category
        if ($line =~ m/^myCategory/) {
            $current_category = $line;
            print "Start $current_category\n";
            <$fhin> for 1..3;    # skip three lines
        } elsif (defined $current_category
                 && exists $desired_categories{$current_category}) {
            # field positions are placeholders; adjust them to the real layout
            my ($val, undef, $key1, undef, undef, $key2, $key3) = split ' ', $line;
            $hash{"$key1$key2$key3"} = $val;
            print "$key1$key2$key3 -> $val\n";
        }
    }
    close($fhin);
}
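To address the ten-files concern from the question, the same state-variable loop could be wrapped in a helper that takes an already-open filehandle, so it can be called once per file. This is only a minimal sketch and not part of the answer's original code: the name parse_categories, its three arguments, the \d+ in the header pattern, and the sample data are all assumptions made for illustration. The field positions follow the question's dummy assignment.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): reads from an open
# filehandle, keeps only the categories named in $wanted, and discards
# $skip lines after every category header.
sub parse_categories {
    my ($fh, $wanted, $skip) = @_;
    my %hash;
    my $current_category;    # state variable: the category we are inside

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^myCategory\d+$/) {
            $current_category = $line;
            for (1 .. $skip) {
                my $discarded = <$fh>;    # read and ignore
            }
            next;
        }
        # Ignore lines before the first header and lines in unwanted categories.
        next unless defined $current_category && $wanted->{$current_category};

        my @fields = split ' ', $line;
        next if @fields < 7;    # too short to build a key
        $hash{ join '', @fields[2, 5, 6] } = $fields[1];
    }
    return \%hash;
}

# Made-up sample data, held in a string so the sketch runs without a file.
my $data = <<'END';
myCategory1
skip this line
a v1 k1 b c k2 k3
myCategory2
skip this line
a v2 k4 b c k5 k6
myCategory3
skip this line
a v3 k7 b c k8 k9
END

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $result = parse_categories($fh, { myCategory1 => 1, myCategory3 => 1 }, 1);
close $fh;

print "$_ -> $result->{$_}\n" for sort keys %$result;
```

With this sample data the sketch prints "k1k2k3 -> v1" and "k7k8k9 -> v3"; the myCategory2 line is filtered out. For real files you would open each one in turn and pass its handle to the helper, so the number of categories no longer affects the amount of code.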
Answer 1 (score: 1)
Your code seems to implement this logic:
You could do the same thing more simply with something like this:
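use strict;
use warnings;

my %hash;
my %categories = (
    'myCategory1' => 1,
    'myCategory2' => 1,
    'myCategory3' => 1,
);

# Assumption: the file name is passed as the first command-line argument.
my $filename = shift @ARGV or die "Usage: $0 FILE\n";
open(FHIN, '<', $filename) or die "Can't open $filename: $!";

while(my $line=<FHIN>){
    chomp($line);
    if(exists $categories{$line}){
        #Skip two lines.
        <FHIN> for (1..2);
    }
    else{
        my @sub_str = split(' ',$line);
        my $temp_key=$sub_str[2].$sub_str[5].$sub_str[6];
        $hash{$temp_key}=$sub_str[1];
    }
}
close(FHIN);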
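Alternatively, if you can come up with a regex that detects every category header, you can use that and do away with the categories hash.
For example, is a category header the only case where a single word appears on a line by itself, with no whitespace? If so, you could write: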
if($line =~ /^\w+$/){
instead of:
if(exists $categories{$line}){
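Putting that substitution into the full loop might look like the sketch below. It assumes, as asked above, that a category header is the only kind of line made of a single word; the header names alpha and beta and the data lines are invented for illustration, and the input is read from an in-memory string rather than FHIN so the example runs on its own.

```perl
use strict;
use warnings;

# Made-up sample: headers are lone words, data lines have seven fields.
my $data = <<'END';
alpha
unwanted line 1
unwanted line 2
x val1 a y z b c
x val2 d y z e f
beta
unwanted line 1
unwanted line 2
x val3 g y z h i
END

open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my %hash;
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^\w+$/) {
        # A lone word with no whitespace is treated as a category header:
        # discard the two lines after it, as in the loop above.
        for (1 .. 2) {
            my $discarded = <$fh>;
        }
        next;
    }
    my @sub_str = split ' ', $line;
    next if @sub_str < 7;    # guard against blank or short lines
    $hash{ $sub_str[2] . $sub_str[5] . $sub_str[6] } = $sub_str[1];
}
close $fh;

print "$_ -> $hash{$_}\n" for sort keys %hash;
```

No list of category names is needed here, so 50 categories cost no extra code. The trade-off is that any data line consisting of a single word would be mistaken for a header, so the pattern should be tightened if such lines can occur.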