I have two files, cell_list.txt and allcells.txt. cell_list.txt lists the names of the cells I need. For example:
cell_abc
cell_acde
c_swer
allcells.txt holds the details of every cell, more than 100 of them. The layout is consistent: each cell's details start with ***** and end with END. For example:
*****
Lib: lib_a
Cell: cell_abc
*****
info absw ...
info swea ...
END
*****
Lib: lib_a
Cell: cell_acdd
*****
info awee ...
info awod ...
info acwe ...
END
*****
Lib: lib_b
Cell: cell_acde
*****
info wseo ...
info poee ...
info awec ...
END
*****
Lib: lib_b
Cell: c_swer
*****
info rtoe ...
info swkt ...
END
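I need to extract the full details of every cell listed in cell_list.txt and copy them into a new file per cell, named cellname.txt. Is there a way to do this with csh or Perl? The expected output is shown below.
Contents of cell_abc.txt: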
*****
Lib: lib_a
Cell: cell_abc
*****
info absw ...
info swea ...
END
Contents of cell_acde.txt:
*****
Lib: lib_b
Cell: cell_acde
*****
info wseo ...
info poee ...
info awec ...
END
Contents of c_swer.txt:
*****
Lib: lib_b
Cell: c_swer
*****
info rtoe ...
info swkt ...
END
This is roughly the script I have so far; I am not familiar with Perl.
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'allcells.txt';
my $list = 'cell_list.txt';
my $string;
my @matches = $file =~ m/(^\* .+? END)/g;
{
    local $/=undef;
    open FILE, $file or die "Couldn't open file: $!";
    $string = <FILE>;
    close FILE;
    while(<>){
        if ($string = @matches) #how to check on cell_list.txt if the cell is listed in the file or not before checking the matching string.
        {
            print $string; #how to extract and print the matching string to new file which will be named based on the cell name listed in cell_list.txt
        }
    }
}
Answer 0 (score: 0)
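You need to read the file first; your script runs its regex against the file name, before any content has been read. Iterate over the other file to populate a hash, and use hash membership to decide whether to write a section to a new file. You can use \Q and \E in a regex to match text literally. The trailing /s regex flag lets . match newlines, so the whole string is treated as one long line.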
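To see those two regex features in isolation, here is a minimal sketch. The sample string is made up for illustration and only mimics the shape of one record in allcells.txt.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up sample record, shaped like one block in allcells.txt
my $text = "*****\nLib: lib_a\nCell: cell_abc\n*****\ninfo absw ...\nEND\n";

# \Q...\E quotes the asterisks so they match literally
# instead of being parsed as quantifiers
my $literal = $text =~ m/\Q*****\E/ ? 1 : 0;

# Without /s, '.' stops at a newline, so a multi-line record cannot match
my $without_s = $text =~ m/\Q*****\E.+?END/ ? 1 : 0;

# With /s, '.' also matches newlines, so the whole record matches
my $with_s = $text =~ m/\Q*****\E.+?END/s ? 1 : 0;

print "literal=$literal without_s=$without_s with_s=$with_s\n";
# prints: literal=1 without_s=0 with_s=1
```

The full script uses the same pattern, with a second \Q*****\E so that one match spans the header and the body of a record.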
#!/usr/bin/env perl
use strict;
use warnings;
my $file = 'allcells.txt';
my $list = 'cell_list.txt';
# Build a lookup hash of the cell names listed in cell_list.txt
my %required_cells;
open my $fhrc, '<', $list
    or die "Unable to open '$list' : $!";
while ( my $line = <$fhrc> ) {
    chomp($line);
    $required_cells{ $line } = 1;
}
close $fhrc;
open my $fh, '<', $file
    or die "Unable to open '$file' : $!";
my $allcells_txt = do { local $/; <$fh> }; # Slurp file into a string
close $fh;
# Each match spans one record, from its first ***** line through END
my @matches = $allcells_txt =~ m|\Q*****\E.+?\Q*****\E.+?END|gs;
for my $group (@matches) {
    my ($cell) = $group =~ m|Cell: (\w+)|s;
    next unless defined $cell; # Skip records without a Cell: line
    if ( exists $required_cells{ $cell } ) {
        print "Cell [ $cell ] is required\n";
        my $out_name = "$cell.txt";
        open my $out, '>', $out_name
            or die "Unable to open '$out_name' for writing : $!";
        # The match stops at END, so add the trailing newline back
        print $out $group . "\n";
        close $out
            or die "Unable to close '$out_name' : $!";
        print "==> Created $out_name\n";
    } else {
        print "Skipping $cell\n";
    }
}
Output
Cell [ cell_abc ] is required
==> Created cell_abc.txt
Skipping cell_acdd
Cell [ cell_acde ] is required
==> Created cell_acde.txt
Cell [ c_swer ] is required
==> Created c_swer.txt
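If allcells.txt grows too large to slurp comfortably, a line-by-line variant may be worth considering. The sketch below is an alternative, not part of the script above: extract_cells is a hypothetical helper name, and it assumes every record ends with a line that is exactly END. It reads from an in-memory sample so it can run without the real files.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns a hashref of cell name => record text
# for the wanted cells. Assumes each record ends with a line "END".
sub extract_cells {
    my ( $in_fh, $wanted ) = @_;
    my ( %found, @record, $cell );
    while ( my $line = <$in_fh> ) {
        push @record, $line;
        $cell = $1 if $line =~ /^Cell:\s*(\S+)/;
        if ( $line =~ /^END\s*$/ ) {
            # Record complete: keep it only if its cell is wanted
            $found{$cell} = join '', @record
                if defined $cell && $wanted->{$cell};
            @record = ();
            undef $cell;
        }
    }
    return \%found;
}

# In-memory demo data; in real use, open allcells.txt instead and
# build the wanted hash from cell_list.txt
my $sample = "*****\nLib: lib_a\nCell: cell_abc\n*****\ninfo absw ...\nEND\n"
           . "*****\nLib: lib_a\nCell: cell_acdd\n*****\ninfo awee ...\nEND\n";
open my $in, '<', \$sample or die "Unable to open in-memory sample: $!";
my $found = extract_cells( $in, { cell_abc => 1 } );
close $in;

print join( ',', sort keys %$found ), "\n";    # prints: cell_abc
```

Writing each value of the returned hash to "$cell.txt" would then produce the same per-cell files as the slurping version, while holding only one record at a time in memory during the scan.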