"readline() on unopened filehandle" error in Perl

Time: 2015-11-08 01:14:15

Tags: perl filehandle

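I'm having trouble fixing an error in my code. I'm trying to get the code to read an input file and extract only what is between the [ ]. However, I get the error "readline() on unopened filehandle ..." and I'm not sure what I'm doing wrong with the filehandle in the while () loop.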

#!/usr/bin/perl
use warnings;

my $file = '';
my $newfile = '';
open($newfile, '>', 'newmyosin.fasta') or die "Can't create file", $!;
open($file, '<', 'myosin.fasta') or die "Can't open file", $!;

while(<$file>) {
        print;
        chomp;
        if ( $_ =~ /\[(.+)\]/ ) {
                $file = $1;
        }
}

So, for example:

This would be part of my input file:

>gi|115527082|ref|NP_005954.3| myosin-1 [Homo sapiens] 
>gi|226694176|sp|P12882.3|MYH1_HUMAN RecName: Full=Myosin-1; AltName: Full=Myosin heavy chain 1; AltName: Full=Myosin heavy chain 2x; Short=MyHC-2x; AltName: Full=Myosin heavy chain IIx/d; Short=MyHC-IIx/d; AltName: Full=Myosin heavy chain, skeletal muscle, adult 1 [Homo sapiens] 
>gi|119610411|gb|EAW90005.1| hCG1986604, isoform CRA_b [Homo sapiens]
MSSDSEMAIFGEAAPFLRKSERERIEAQNKPFDAKTSVFVVDPKESFVKATVQSREGGKVTAKTEAGATVTVKDDQVFPM
NPPKYDKIEDMAMMTHLHEPAVLYNLKERYAAWMIYTYSGLFCVTVNPYKWLPVYNAEVVTAYRGKKRQEAPPHIFSISD
NAYQFMLTDRENQSILITGESGAGKTVNTKRVIQYFATIAVTGEKKKEEVTSGKMQGTLEDQIISANPLLEAFGNAKTVR
NDNSSRFGKFIRIHFGTTGKLASADIETYLLEKSRVTFQLKAERSYHIFYQIMSNKKPDLIEMLLITTNPYDYAFVSQGE
ITVPSIDDQEELMATDSAIEILGFTSDERVSIYKLTGAVMHYGNMKFKQKQREEQAEPDGTEVADKAAYLQNLNSADLLK
ALCYPRVKVGNEYVTKGQTVQQVYNAVGALAKAVYDKMFLWMVTRINQQLDTKQPRQYFIGVLDIAGFEIFDFNSLEQLC
INFTNEKLQQFFNHHMFVLEQEEYKKEGIEWTFIDFGMDLAACIELIEKPMGIFSILEEECMFPKATDTSFKNKLYEQHL
GKSNNFQKPKPAKGKPEAHFSLIHYAGTVDYNIAGWLDKNKDPLNETVVGLYQKSAMKTLALLFVGATGAEAEAGGGKKG
GKKKGSSFQTVSALFRENLNKLMTNLRSTHPHFVRCIIPNETKTPGAMEHELVLHQLRCNGVLEGIRICRKGFPSRILYA
DFKQRYKVLNASAIPEGQFIDSKKASEKLLGSIDIDHTQYKFGHTKVFFKAGLLGLLEEMRDEKLAQLITRTQAMCRGFL
ARVEYQKMVERRESIFCIQYNVRAFMNVKHWPWMKLYFKIKPLLKSAETEKEMANMKEEFEKTKEELAKTEAKRKELEEK
MVTLMQEKNDLQLQVQAEADSLADAEERCDQLIKTKIQLEAKIKEVTERAEDEEEINAELTAKKRKLEDECSELKKDIDD
LELTLAKVEKEKHATENKVKNLTEEMAGLDETIAKLTKEKKALQEAHQQTLDDLQAEEDKVNTLTKAKIKLEQQVDDLEG
SLEQEKKIRMDLERAKRKLEGDLKLAQESTMDIENDKQQLDEKLKKKEFEMSGLQSKIEDEQALGMQLQKKIKELQARIE
ELEEEIEAERASRAKAEKQRSDLSRELEEISERLEEAGGATSAQIEMNKKREAEFQKMRRDLEEATLQHEATAATLRKKH
ADSVAELGEQIDNLQRVKQKLEKEKSEMKMEIDDLASNMETVSKAKGNLEKMCRALEDQLSEIKTKEEEQQRLINDLTAQ
RARLQTESGEYSRQLDEKDTLVSQLSRGKQAFTQQIEELKRQLEEEIKAKSALAHALQSSRHDCDLLREQYEEEQEAKAE

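From this, I want to create a new file, "newmyosin.fasta", containing the organism name extracted from the brackets in each sample's header (e.g. [Homo sapiens]). The Perl code is meant to read the myosin.fasta file above, which contains multiple samples, pick out the name inside the brackets [], and write it to a new file (e.g. newmyosin.fasta).

Thanks!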

2 answers:

Answer 0 (score: 2)

When you do this:

$file = $1;

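you overwrite the filehandle stored in $file with the captured string. After that, the loop can no longer read from the file, and you get the error you mentioned.

You should save the match in a different variable instead, for example: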

my $match = $1;

And perhaps print it to the output file:

print $newfile $match;
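
Putting the two snippets together, a minimal sketch of the corrected script might look like the following. The sub name extract_organisms and the command-line handling are hypothetical, introduced here only for illustration; the pattern is the one from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): for each line of $in_path
# that contains [...], write the captured text to $out_path, one per line.
# Returns the number of matches written.
sub extract_organisms {
    my ($in_path, $out_path) = @_;

    open(my $in,  '<', $in_path)  or die "Can't open $in_path: $!";
    open(my $out, '>', $out_path) or die "Can't create $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        # Same pattern as the question: captures from the first '['
        # to the last ']' on the line.
        if ($line =~ /\[(.+)\]/) {
            my $match = $1;          # separate variable; $in is left alone
            print $out "$match\n";
            $count++;
        }
    }

    close($in);
    close($out) or die "Can't close $out_path: $!";
    return $count;
}

# Assumed usage: perl script.pl myosin.fasta newmyosin.fasta
extract_organisms(@ARGV) if @ARGV == 2;
```

Using lexical filehandles (my $in, my $out) and a separate $match variable means the captured text can never replace the handle that the while loop reads from.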

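Answer 1 (score: 0)

As I said in a comment, you are reassigning the filehandle variable to the capture group partway through reading the file. Since you opened a separate file for output, I assume you want to print the matched strings to that file.

That said, your requirements are quite vague, your sample input doesn't look accurate, and you didn't provide any sample output. If I understand your intent, though, I think this is what you want: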

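use strict;
use warnings;

my $file = 'myosin.fasta';
my $tmp = "$file.tmp";

open(my $new, '>', $tmp) or die "Can't open $tmp: $!";
open(my $old, '<', $file) or die "Can't open $file: $!";

# Write the text inside the first [...] of each matching line
while (<$old>) {
    if (/\[([^]]+)\]/) {
        print $new "$1\n";
    }
}

close($old);
close($new);

# Keep the original as a backup, then move the new file into place
rename($file, "$file.bak") or die "Can't rename $file: $!";
rename($tmp, $file) or die "Can't rename $tmp: $!";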

Contents of myosin.fasta after running the script:

Homo sapiens
Homo sapiens
Homo sapiens
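
One detail worth noting: the question's pattern uses .+, which is greedy, while the pattern in this answer uses [^]]+. For the sample input, which has one bracketed group per header, both give the same result. A small sketch of where they differ, using a made-up header line with two bracketed groups (the sample string and the sub names are hypothetical):

```perl
use strict;
use warnings;

# Greedy: .+ runs from the first '[' to the LAST ']' on the line.
sub greedy_capture {
    my ($line) = @_;
    return $line =~ /\[(.+)\]/ ? $1 : undef;
}

# Negated class: [^]]+ stops at the FIRST ']' after the opening '['.
sub bounded_capture {
    my ($line) = @_;
    return $line =~ /\[([^]]+)\]/ ? $1 : undef;
}

my $header = '>id1 protein [partial] [Homo sapiens]';   # made-up example
print greedy_capture($header),  "\n";   # prints: partial] [Homo sapiens
print bounded_capture($header), "\n";   # prints: partial
```

Neither pattern picks the last bracketed group on its own, so if real headers can contain more than one, the pattern would need to be adjusted for that case.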