Code:
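#!/usr/bin/perl
my $file = $ARGV[0];
my $position = $ARGV[1]; # POSITION OF THE RESIDUE
open (FILE, $file) or die "Cannot open $file: $!";
my @line;
while (<FILE>) {
    my @f = split;
    next unless @f > 1; # skip blank or single-field lines
    # Match the key as a string; == would compare both sides numerically
    if (($f[0] =~ /^ANNOT_RESID_NO/) && ($f[1] == $position)) {
        push @line, $_;
    }
}
print @line;
close(FILE);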
INPUT:
ANNOT_TYPE[1] 0
ANNOT_TYPE_NAME[1] CATRES
ANNOT_NUMBER[1][1] 1
ANNOT_NAME[1][1] 3.1.3.16
ANNOT_DESC[1][1] Phosphoprotein phosphatase.
ANNOT_RESID_NO[1][1][1] 91
ANNOT_RESID_NAME[1][1][1] ASP
ANNOT_RESID_NUM[1][1][1] 95
ANNOT_RESID_NO[1][1][2] 92
ANNOT_RESID_NAME[1][1][2] ARG
ANNOT_NRESID[1][1] 6
ANNOT_NUMBER[1][2] 2
ANNOT_NAME[1][2] 3.1.3.53
ANNOT_DESC[1][2] [Myosin-light-chain] phosphatase.
ANNOT_RESID_NO[1][2][1] 91
ANNOT_RESID_NAME[1][2][1] ASP
ANNOT_RESID_NUM[1][2][1] 95
ANNOT_RESID_NO[1][2][2] 92
ANNOT_RESID_NAME[1][2][2] ARG
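Question:
I print the lines that start with "ANNOT_RESID_NO" and whose value equals $position (for example 91). In addition to each such line, I also want to print the first line above it that contains "ANNOT_DESC". This "ANNOT_DESC" line is not always directly above my matching line.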
Answer 0 (score: 1)
Try (complete code):
#!/usr/bin/perl
use strict;
use warnings;
my $file = $ARGV[0];
my $position = $ARGV[1];
open (FILE, $file) or die $!;
my $desc; # most recent ANNOT_DESC line seen so far
my @line;
while (<FILE>) {
    my @f = split " ";
    next unless @f > 1; # skip blank or single-field lines
    if ( $f[0] =~ /^ANNOT_DESC/ ) {
        $desc = $_;
        next;
    }
    if ( $f[0] =~ /^ANNOT_RESID_NO/ and $f[1] == $position ) {
        push @line, $desc if defined $desc;
        push @line, $_;
    }
}
print @line;
close(FILE);
Output:
ANNOT_DESC[1][1] Phosphoprotein phosphatase.
ANNOT_RESID_NO[1][1][1] 91
ANNOT_DESC[1][2] [Myosin-light-chain] phosphatase.
ANNOT_RESID_NO[1][2][1] 91
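Answer 1 (score: 0)
If the data set is small, you can push the lines from the file into an array (e.g. @file_data), iterate over the @file_data array, and push the desired values into the @line array.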
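A minimal sketch of that array-based approach, assuming the whole file fits in memory. The helper name lines_with_desc is hypothetical (it does not appear in either answer), and the sample data is a few lines copied from the INPUT above. For each matching ANNOT_RESID_NO line it walks backwards through @file_data to the nearest ANNOT_DESC line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original answers): returns each matching
# ANNOT_RESID_NO line, preceded by the nearest ANNOT_DESC line above it.
sub lines_with_desc {
    my ($file_data, $position) = @_;
    my @line;
    for my $i (0 .. $#{$file_data}) {
        my @f = split " ", $file_data->[$i];
        next unless @f > 1;    # skip blank or single-field lines
        next unless $f[0] =~ /^ANNOT_RESID_NO/ and $f[1] == $position;
        # Walk backwards to the closest ANNOT_DESC line, if there is one.
        for (my $j = $i - 1; $j >= 0; $j--) {
            if ($file_data->[$j] =~ /^\s*ANNOT_DESC/) {
                push @line, $file_data->[$j];
                last;
            }
        }
        push @line, $file_data->[$i];
    }
    return @line;
}

# Sample data taken from the INPUT section; a real script would
# fill this array with: my @file_data = <FILE>;
my @file_data = (
    "ANNOT_DESC[1][1] Phosphoprotein phosphatase.\n",
    "ANNOT_RESID_NO[1][1][1] 91\n",
    "ANNOT_RESID_NAME[1][1][1] ASP\n",
    "ANNOT_RESID_NO[1][1][2] 92\n",
);
print lines_with_desc(\@file_data, 91);
```

Compared with Answer 0, which only remembers the last ANNOT_DESC line while streaming, this version keeps every line in memory, so it is probably only worth it when other context around a match is needed as well.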