Match a line above and print the line

Time: 2012-08-09 16:25:02

Tags: perl pattern-matching match

Code:

#!/usr/bin/perl

my $file = $ARGV[0];
my $position = $ARGV[1]; # POSITION OF THE RESIDUE

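open (FILE, $file) or die "Cannot open $file: $!";

my @line;
while (<FILE>) {
    my @f = split;
    next unless @f >= 2;    # skip blank or single-field lines
    # Field 0 looks like "ANNOT_RESID_NO[1][1][1]", so test its prefix as a
    # string; "==" compares numerically and treats both strings as 0.
    if (($f[0] =~ /^ANNOT_RESID_NO/) && ($f[1] == $position)) {
        push @line, $_;
    }
}
print @line;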
close(FILE);

INPUT:

ANNOT_TYPE[1] 0
ANNOT_TYPE_NAME[1] CATRES
ANNOT_NUMBER[1][1] 1
ANNOT_NAME[1][1] 3.1.3.16
ANNOT_DESC[1][1] Phosphoprotein phosphatase.
ANNOT_RESID_NO[1][1][1] 91
ANNOT_RESID_NAME[1][1][1] ASP
ANNOT_RESID_NUM[1][1][1]   95 
ANNOT_RESID_NO[1][1][2] 92
ANNOT_RESID_NAME[1][1][2] ARG
ANNOT_NRESID[1][1] 6
ANNOT_NUMBER[1][2] 2
ANNOT_NAME[1][2] 3.1.3.53
ANNOT_DESC[1][2] [Myosin-light-chain] phosphatase.
ANNOT_RESID_NO[1][2][1] 91
ANNOT_RESID_NAME[1][2][1] ASP
ANNOT_RESID_NUM[1][2][1]   95 
ANNOT_RESID_NO[1][2][2] 92
ANNOT_RESID_NAME[1][2][2] ARG

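Question:

I am printing the lines that contain $position (e.g. 91) and start with "ANNOT_RESID_NO". In addition to each such line, I would also like to print the first line above the match that contains "ANNOT_DESC". This "ANNOT_DESC" line is not always directly above my matched line.

2 answers:

Answer 0 (score: 1)

Try (complete code):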

#!/usr/bin/perl

use strict;
use warnings;

my $file = $ARGV[0];
my $position = $ARGV[1];

open (FILE, $file) or die $!;

my $desc;

my @line;

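while (<FILE>) {
    my @f = split " ";
    next unless @f;    # skip blank lines

    # Remember the most recent ANNOT_DESC line seen so far.
    if ( $f[0] =~ /^ANNOT_DESC/ ) {
        $desc = $_;
        next;
    }

    if ( $f[0] =~ /^ANNOT_RESID_NO/  and $f[1] == $position ) {
        # $desc is still undefined if no ANNOT_DESC line preceded the match.
        push @line, grep { defined $_ } ($desc, $_);
    }
}

print @line;
close(FILE);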

Output:

ANNOT_DESC[1][1] Phosphoprotein phosphatase.
ANNOT_RESID_NO[1][1][1] 91
ANNOT_DESC[1][2] [Myosin-light-chain] phosphatase.
ANNOT_RESID_NO[1][2][1] 91

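Answer 1 (score: 0)

If the data set is small, you can push the lines from the file into an array (e.g. @file_data), iterate over the @file_data array, and push the required values into the @line array.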
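
The array-based approach could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper name lines_with_desc is hypothetical (it does not appear in either answer), the sample records are copied from the question's input instead of being read from a file, and the backward scan for the nearest "ANNOT_DESC" line is one possible way to "iterate and push the required values".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the answers): for every ANNOT_RESID_NO line
# whose value equals $position, collect the nearest preceding ANNOT_DESC line
# followed by the matching line itself.
sub lines_with_desc {
    my ($position, @file_data) = @_;
    my @line;

    for my $i (0 .. $#file_data) {
        my @f = split ' ', $file_data[$i];
        next unless @f >= 2;
        next unless $f[0] =~ /^ANNOT_RESID_NO/ && $f[1] == $position;

        # Walk upwards through the array until an ANNOT_DESC line is found.
        for (my $j = $i - 1; $j >= 0; $j--) {
            if ($file_data[$j] =~ /^ANNOT_DESC/) {
                push @line, $file_data[$j];
                last;
            }
        }
        push @line, $file_data[$i];
    }
    return @line;
}

# Sample records taken from the question's input (assumption: in real use
# this array would be filled by reading the file instead).
my @file_data = map { "$_\n" } (
    'ANNOT_DESC[1][1] Phosphoprotein phosphatase.',
    'ANNOT_RESID_NO[1][1][1] 91',
    'ANNOT_RESID_NAME[1][1][1] ASP',
    'ANNOT_RESID_NUM[1][1][1]   95',
    'ANNOT_RESID_NO[1][1][2] 92',
    'ANNOT_DESC[1][2] [Myosin-light-chain] phosphatase.',
    'ANNOT_RESID_NO[1][2][1] 91',
);

print lines_with_desc(91, @file_data);
```

To use it on a real file, the inline sample could be replaced by slurping the file into the array, for example `my @file_data = <FILE>;` after a checked open. Because the whole file is held in memory, this is only reasonable for small inputs, as the answer notes.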