Question

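This is part of an assignment. In this part I am trying to write a program that builds a hash: the key is the accession number from the file, and the value is the whole line. However, the program gives me a warning. The code is: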

#!/usr/bin/perl

#pseudocode:
#open file1, store UniProt accession as key and the line as value
#open file2, store UniProt accession as key and the line as value, for lines that contain "IDA"
#compare keys in two hashes, find out matched keys
#print out lines from file2 that match

use strict;
use warnings;
use feature qw(say);

my $infile1 = "geneIDs3_MouseToUniProtAccessions.txt";
my $inFH1;
open ($inFH1, "<", $infile1) or die join (" ", "Can't open", $infile1, "for reading:", $!);
my @array1 = <$inFH1>;
close $inFH1;
shift @array1;
my %geneID1;
for ($a = 0; $a < scalar @array1; $a++){
    chomp $array1[$a];
    $array1[$a] =~ /.*?\t(.*?)\t.*/;
    $geneID1{$1} = $array1[$a];
    #say ("$1", '->', "$geneID1{$array1[$a]}");    #test if the hash has been successfully created, however it doesn't
    #say $array1[$a];              #test if the program can recognize the elements, it does
}

The file geneIDs3_MouseToUniProtAccessions.txt contains 1,000 lines, so there are a lot of warnings. The first two lines are:

From    To  Species Gene Name
PNMA3   Q9H0A4  Homo sapiens    paraneoplastic antigen MA3

The warnings look like this:

Use of uninitialized value within %geneID1 in string at match_for_part_III_10.pl line 24.
Q9H0A4->

I found a solution: use a while loop instead. It not only works, it is also more elegant. The new code is:

#!/usr/bin/perl

#pseudocode:
#open file1, store UniProt accession as key and the line as value
#open file2, store UniProt accession as key and the line as value, for lines that contain "IDA"
#compare keys in two hashes, find out matched keys
#print out lines from file2 that match

use strict;
use warnings;
use feature qw(say);

my $infile1 = "geneIDs3_MouseToUniProtAccessions.txt";
my $inFH1;
open ($inFH1, "<", $infile1) or die join (" ", "Can't open", $infile1, "for reading:", $!);
my %geneID1;

while (<$inFH1>){
    $_ =~ /.*?\t(.*?)\t.*/;
    $geneID1{$1} = $_;
    say ("$1", '->', "$geneID1{$1}");
}
close $inFH1;

Thank you all for your help!

Answer 1

#!/usr/bin/perl

use strict;
use warnings;
use feature qw( say );

<>; # Skip header.

my %geneID1;
while (<>) {
   chomp;
   my @fields = split /\t/;
   my $id = $fields[1];
   $geneID1{$id} = $_;
}

say "$_ => $geneID1{$_}" for sort keys %geneID1;

(Pass geneIDs3_MouseToUniProtAccessions.txt as an argument.)
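
For readers unfamiliar with the bare `<>` operator, here is a minimal sketch of how it picks up its input. It reads from the files named in `@ARGV`, or from STDIN when `@ARGV` is empty. The temporary file and its two lines are stand-ins for the real data, and setting `@ARGV` by hand is only there to make the sketch self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Stand-in for the real input file: a header plus one data line (assumed tab-separated).
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "From\tTo\tSpecies\tGene Name\n";
print {$tmp_fh} "PNMA3\tQ9H0A4\tHomo sapiens\tparaneoplastic antigen MA3\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

# <> reads the files listed in @ARGV; this imitates passing the file on the command line.
@ARGV = ($tmp_name);

<>;    # Skip header.

my %geneID1;
while (<>) {
    chomp;
    my $id = ( split /\t/ )[1];    # second tab-separated field
    $geneID1{$id} = $_;
}

print "$_ => $geneID1{$_}\n" for sort keys %geneID1;
```

On the command line this corresponds to something like `perl script.pl geneIDs3_MouseToUniProtAccessions.txt`, where `script.pl` is a hypothetical name for the saved program.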

Answer 2

It is hard to say what the error is, given the tabs (are they tabs?) and the changed code in the question.
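
One possible reading, sketched below, assumes that the commented-out `say` line in the first version was active when the warning appeared. That line looks the hash up by the whole line, `$geneID1{$array1[$a]}`, while the entry was stored under the accession `$1`, so the lookup finds nothing. The sample line is taken from the question and is assumed to be tab-separated.

```perl
use strict;
use warnings;

# Sample data line from the question, assumed to be tab-separated.
my $line = "PNMA3\tQ9H0A4\tHomo sapiens\tparaneoplastic antigen MA3";

my %geneID1;

# Same regex as in the question; capture the second field as the key.
my ($id) = $line =~ /.*?\t(.*?)\t.*/
    or die "Line does not match the expected format: $line";
$geneID1{$id} = $line;

# Looking up by the whole line finds no entry; interpolating that
# missing value is what would trigger "Use of uninitialized value".
print "by line: ", ( exists $geneID1{$line} ? "found" : "missing" ), "\n";

# Looking up by the accession finds the stored line.
print "by id:   ", ( exists $geneID1{$id} ? "found" : "missing" ), "\n";
```

Under that assumption the hash itself was being filled correctly, and only the test print was looking in the wrong place.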

However, there are a number of things in the code that can be improved

use warnings;
use strict;
use feature 'say';

my $file = 'geneIDs3_MouseToUniProtAccessions.txt';
open my $fh, '<', $file or die "Can't open $file: $!";

my %geneID1;

my $header = <$fh>;    
while (<$fh>) {
    chomp;
    $geneID1{ (split /\t/)[1] } = $_; 
}

say "$_ => $geneID1{$_}" for sort keys %geneID1;

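One wild card is your data: if you are not sure the separators are TAB characters, split on \s+ instead (which also matches tabs), since you only need the second field. Whitespace splitting is the default for split, so you can simply write (split)[1]. This works as long as the first field itself contains no whitespace.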
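
A small sketch of the difference, assuming a line whose fields are separated by runs of spaces instead of tabs (the sample line mirrors the one shown in the question):

```perl
use strict;
use warnings;

# Assumed input: fields separated by runs of spaces, not tabs.
my $line = "PNMA3   Q9H0A4  Homo sapiens    paraneoplastic antigen MA3";

# Splitting on tabs yields a single field, so index 1 is undefined.
my $by_tab = ( split /\t/, $line )[1];

# Splitting on whitespace (what a bare split does to $_) still finds the accession.
my $by_space = ( split ' ', $line )[1];

print "tab split:        ", ( $by_tab // '(undef)' ), "\n";
print "whitespace split: $by_space\n";
```

Whitespace splitting also breaks later fields such as "Homo sapiens" into separate pieces, which does not matter here because only the second field is used as the key.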

Comments on the original code

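Only read the whole file up front if there is a specific reason to
Declare everything, even where some special feature lets you get away without it ($a)
Declare in the smallest scope possible and close to where it is needed: open my $fh, ...
Do not use special variables such as $a other than for their intended purpose! ($a and $b belong to sort)
A C-style for loop is almost never needed. If you need the index in the iteration, use
```
foreach my $i (0 .. $#ary) { ... }
```
where $#ary is the index of the last element of the array @ary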
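
As a rough illustration of the last two points, the sketch below iterates by index with a lexical `$i` and leaves `$a` and `$b` to `sort`. The array contents are hypothetical sample data; `P00002` and `GENE2` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical tab-separated lines; the second entry is invented for illustration.
my @ary = ( "PNMA3\tQ9H0A4", "GENE2\tP00002" );

# Map each accession to the index of the line it came from.
my %index_of;
foreach my $i ( 0 .. $#ary ) {
    my $id = ( split /\t/, $ary[$i] )[1];
    $index_of{$id} = $i;
}

# $a and $b are reserved for sort blocks, which is why they should not be loop counters.
my @sorted = sort { $a cmp $b } keys %index_of;

print "$_ is at index $index_of{$_}\n" for @sorted;
```

If the index is not needed, `for my $line (@ary) { ... }` is simpler still.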

How to assign values to a hash in Perl
