How to get all the information about each cell, based on cell_list.txt

Date: 2018-05-30 02:51:25

Tags: perl pattern-matching csh

In this case I have two files, cell_list.txt and allcells.txt. cell_list.txt lists the required cell names. For example:

cell_abc
cell_acde
c_swer

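Then I have allcells.txt, which shows the details of all cells, more than 100 of them. The pattern looks very similar for each one: every cell's details start with ***** and end with 'END'. For example: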

*****
Lib: lib_a
Cell: cell_abc
*****
info absw ...
info swea ...
END

*****
Lib: lib_a
Cell: cell_acdd
*****
info awee ...
info awod ...
info acwe ...
END

*****
Lib: lib_b
Cell: cell_acde
*****
info wseo ...
info poee ...
info awec ...
END

*****
Lib: lib_b
Cell: c_swer
*****
info rtoe ...
info swkt ...
END

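I need to get all the details for the cells listed in cell_list.txt and copy them into a new file, cellname.txt, for each cell. Is there a way to make this work using csh or perl? The expected output is shown below.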

Contents of cell_abc.txt:

*****
Lib: lib_a
Cell: cell_abc
*****
info absw ...
info swea ...
END

Contents of cell_acde.txt:

*****
Lib: lib_b
Cell: cell_acde
*****
info wseo ...
info poee ...
info awec ...
END

Contents of c_swer.txt:

*****
Lib: lib_b
Cell: c_swer
*****
info rtoe ...
info swkt ...
END

This is roughly the script I have so far, as I am not familiar with Perl.

#!/usr/bin/perl     
use strict;
use warnings;

my $file = 'allcells.txt';
my $list = 'cell_list.txt';
my $string;
my @matches = $file =~ m/(^\* .+? END)/g;
{
  local $/=undef;
  open FILE, $file or die "Couldn't open file: $!";
  $string = <FILE>;
  close FILE;

        while(<>){
        if ($string = @matches) #how to check on cell_list.txt if the cell is listed in the file or not before checking the matching string.
        {
                print $string; #how to extract and print the matching string to new file which will be named based on the cell name listed in cell_list.txt
        }
       }
}

1 Answer:

Answer 0: (score: 0)

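You need to read the file first; the regex in your script runs against the filename string in $file, before anything has been read. Iterate over the other file to populate a hash, and use hash membership to decide whether to print a section to a new file. You can use \Q...\E in the regex to match the asterisks literally. The trailing /s flag lets . match newlines, so a single match can span a whole multi-line section.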
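To make the role of \Q...\E and /s concrete, here is a minimal sketch on an in-memory string shaped like one record from allcells.txt. The sample text and the variable names $with_s and $without_s are illustrative assumptions, not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Illustrative sample shaped like one record in allcells.txt
my $text = "*****\nLib: lib_a\nCell: cell_abc\n*****\ninfo absw ...\nEND\n";

# \Q...\E quotes the asterisks so they match literally instead of acting
# as quantifiers; /s lets . cross newlines so one match spans the record.
my ($with_s)    = $text =~ m|(\Q*****\E.+?\Q*****\E.+?END)|s;
my ($without_s) = $text =~ m|(\Q*****\E.+?\Q*****\E.+?END)|;

print defined($with_s)    ? "with /s: matched\n"    : "with /s: no match\n";
print defined($without_s) ? "without /s: matched\n" : "without /s: no match\n";
```

Without /s the pattern cannot get past the newline that follows the first row of asterisks, which is why the flag matters for multi-line sections like these.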

#!/usr/bin/env perl
use strict;
use warnings;

my $file = 'allcells.txt';
my $list = 'cell_list.txt';

# Load the wanted cell names into a hash for quick membership tests
my %required_cells;
open my $fhrc, '<', $list
    or die "Unable to open '$list' : $!";
while ( my $line = <$fhrc> ) {
    chomp($line);
    $required_cells{ $line } = 1;
}
close $fhrc;

open my $fh, '<', $file
    or die "Unable to open '$file' : $!";
my $allcells_txt = do { local $/; <$fh> }; # Slurp file into a string
close $fh;

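# Each match is one section: the header between two ***** lines, then the body up to END
my @matches = $allcells_txt =~ m|\Q*****\E.+?\Q*****\E.+?END|gs;
for my $group (@matches) {
    my ($cell) = $group =~ m|Cell: (\w+)|s;
    next unless defined $cell;    # skip a section that has no Cell: line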
    if ( exists $required_cells{ $cell } ) {
        print "Cell [ $cell ] is required\n";
        my $out_name = "$cell.txt";
        open my $out, '>', $out_name
            or die "Unable to open '$out_name' for writing : $!";
        print $out $group . "\n";
        close $out
            or die "Unable to close '$out_name' : $!";
        print "==> Created $out_name\n";
    } else {
        print "Skipping $cell\n";
    }
}

Output:

Cell [ cell_abc ] is required
==> Created cell_abc.txt
Skipping cell_acdd
Cell [ cell_acde ] is required
==> Created cell_acde.txt
Cell [ c_swer ] is required
==> Created c_swer.txt
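As an alternative to slurping the whole file, the records can also be read one at a time by setting the input record separator $/ to "END\n". The following is only a sketch under two assumptions: every record ends with END followed by a newline, and the helper name collect_cells is hypothetical. The demo uses in-memory filehandles so that it runs without any files on disk.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: read records one at a time from $all_fh and return a
# hash of cell name => record text for the names listed in $list_fh.
sub collect_cells {
    my ($list_fh, $all_fh) = @_;

    my %wanted;
    while ( my $name = <$list_fh> ) {
        $name =~ s/^\s+|\s+$//g;            # trim whitespace and newline
        $wanted{$name} = 1 if length $name;
    }

    my %found;
    local $/ = "END\n";                     # one read returns one record
    while ( my $record = <$all_fh> ) {
        $record =~ s/\A\s+//;               # drop blank lines between records
        my ($cell) = $record =~ /^Cell:\s*(\S+)/m or next;
        $found{$cell} = $record if $wanted{$cell};
    }
    return %found;
}

# Demo data (assumed, modeled on the question's samples)
my $list = "cell_abc\nc_swer\n";
my $all  = join '',
    "*****\nLib: lib_a\nCell: cell_abc\n*****\ninfo absw ...\nEND\n\n",
    "*****\nLib: lib_a\nCell: cell_acdd\n*****\ninfo awee ...\nEND\n\n",
    "*****\nLib: lib_b\nCell: c_swer\n*****\ninfo rtoe ...\nEND\n";

open my $list_fh, '<', \$list or die "Cannot open in-memory list: $!";
open my $all_fh,  '<', \$all  or die "Cannot open in-memory data: $!";

my %found = collect_cells($list_fh, $all_fh);
print "$_\n" for sort keys %found;
```

With real files, the two filehandles would be opened on cell_list.txt and allcells.txt, and each $found{$cell} would be printed to a handle opened for writing on "$cell.txt". This keeps only one record in memory at a time while scanning, at the cost of relying on the exact "END\n" terminator.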