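I'm a Perl beginner, and I want to parse some fields from a Swiss-Prot (.swiss) file into a text file. I have worked out how to extract the ID from the file, but that is all so far. I need to get both the ID and the AC from the file.
My Swiss file looks like this: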
ID   140U_DROME              Reviewed;         261 AA.
AC   P81928; Q9VFM8;
SQ   SEQUENCE   261 AA;  29182 MW;  5DB78CF6CFC4435A CRC64;
     MNFLWKGRRF LIAGILPTFE GAADEIVDKE NKTYKAFLAS KPPEETGLER LKQMFTIDEF
     GSISSELNSV YQAGFLGFLI GAIYGGVTQS RVAYMNFMEN NQATAFKSHF DAKKKLQDQF
     TVNFAKGGFK WGWRVGLFTT SYFGIITCMS VYRGKSSIYE YLAAGSITGS LYKVSLGLRG
     MAAGGIIGGF LGGVAGVTSL LLMKASGTSM EEVRYWQYKW RLDRDENIQQ AFKKLTEDEN
     PELFKAHDEK TSEHVSLDTI K
//
My code:
open(IN, "<transmem_proteins.swiss") or die "Cant open the file";
open(OUT, ">text.txt") or die "Cant open the file";
while(<IN>){
    if($_=~/^ID\s{3}(\S+\s)/){
        print OUT ">$1| \n";
        print OUT "// \n";
    }
}
Answer 0 (score: 0)
Here is an example of how to extract data from a Swiss file:
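use feature qw(say);
use strict;
use warnings;

{
    my $data = read_swiss_file();
    my @ids;
    for my $chunk ( @$data ) {
        my ( $item1, $item2, $item3 );
        # ID line: capture the entry name and the length field (e.g. "261 AA").
        # \s*$ (rather than \s+$) also matches when nothing follows the final period.
        if ( $chunk =~ /^ID\s{3}(\S+)\s+\S+;\s+(.*)\.\s*$/m ) {
            $item1 = $1;
            $item2 = $2;
            $item2 =~ s/\s+//;    # "261 AA" becomes "261AA"
        }
        # AC line: capture only the first accession number.
        if ( $chunk =~ /^AC\s{3}(\S+);/m ) {
            $item3 = $1;
        }
        # Keep only the chunks that actually contain an ID line.
        push @ids, [$item1, $item2, $item3] if defined $item1;
    }
    my $fn = 'text.txt';
    open ( my $fh, '>', $fn ) or die "Could not open file '$fn': $!";
    for my $items (@ids) {
        say $fh "->", join '|', @$items;
    }
    close $fh;
}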
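
# Read the whole file and split it into records at each "//" terminator line.
sub read_swiss_file {
    my $fn = 'transmem_proteins.swiss';
    open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
    my $str = do { local $/; <$fh> };    # slurp mode: read the entire file at once
    close $fh;
    my @chunks = split /(?m:^\/\/)/, $str;
    return \@chunks;
}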
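
The sample record lists two accession numbers on its AC line, and the code above keeps only the first. If all of them are wanted, a variation along the following lines might work. It is only a sketch under some assumptions: `parse_swiss_records` is a hypothetical helper name, the file is assumed to use Unix line endings (so that each record ends with `//` followed by a newline), and the demo reads a shortened in-memory copy of the record from the question (ID and AC lines only) instead of `transmem_proteins.swiss`.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper: read records one at a time from an open filehandle
# and return a list of [entry_name, length, \@accessions].
sub parse_swiss_records {
    my ($fh) = @_;
    local $/ = "//\n";    # assumption: each record ends with "//" and a Unix newline
    my @records;
    while ( my $chunk = <$fh> ) {
        # Entry name and sequence length from the ID line; skip chunks without one.
        my ( $name, $length ) = $chunk =~ /^ID\s+(\S+)\s+\S+;\s+(\d+)\s+AA\./m
            or next;
        # Collect the accessions from every AC line in this record.
        my @acc = map { /([^\s;]+)/g } $chunk =~ /^AC\s+(.*)$/mg;
        push @records, [ $name, $length, \@acc ];
    }
    return @records;
}

# Shortened in-memory copy of the record from the question.
my $sample = <<'END';
ID   140U_DROME              Reviewed;         261 AA.
AC   P81928; Q9VFM8;
//
END

open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
for my $rec ( parse_swiss_records($in) ) {
    my ( $name, $length, $acc ) = @$rec;
    say join '|', ">$name", $length, join( ',', @$acc );
}
close $in;
# prints: >140U_DROME|261|P81928,Q9VFM8
```

To run this against the real file, open `transmem_proteins.swiss` for reading and pass that filehandle instead of `$in`. Reading one record per iteration also avoids loading the whole file into memory, which may matter for large Swiss-Prot dumps.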