How to assign values to a hash in Perl

Asked: 2017-10-27 02:55:49

Tags: perl

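This is part of an assignment. In this part, I am trying to write a program that builds a hash. The keys are the accession numbers from the file, and the values are the whole lines. However, the program gives me a warning. The code is: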

#!/usr/bin/perl

#pseudocode:
#open file1, store UniProt accession as key and the line as value
#open file2, store UniProt accession as key and the line as value, only for lines that contain "IDA"
#compare keys in two hashes, find out matched keys
#print out lines from file2 that match

use strict;
use warnings;
use feature qw(say);

my $infile1 = "geneIDs3_MouseToUniProtAccessions.txt";
my $inFH1;
open ($inFH1, "<", $infile1) or die join (" ", "Can't open", $infile1, "for reading:", $!);
my @array1 = <$inFH1>;
close $inFH1;
shift @array1;
my %geneID1;
for ($a = 0; $a < scalar @array1; $a++){
    chomp $array1[$a];
    $array1[$a] =~ /.*?\t(.*?)\t.*/;
    $geneID1{$1} = $array1[$a];
    #say ("$1", '->', "$geneID1{$array1[$a]}");    #test if the hash has been successfully created, however it doesn't
    #say $array1[$a];              #test if the program can recognize the elements, it does
}

The file geneIDs3_MouseToUniProtAccessions.txt contains 1,000 lines, so there are a lot of warnings. The first two lines are:

From    To  Species Gene Name
PNMA3   Q9H0A4  Homo sapiens    paraneoplastic antigen MA3

The warnings look like this:

Use of uninitialized value within %geneID1 in string at match_for_part_III_10.pl line 24.
Q9H0A4->
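
The `Q9H0A4->` in that output suggests the debug `say` line was active when the warning appeared, and that the assignment itself worked: the value is stored under the captured accession `$1`, but the debug line looks it up with the whole line, `$geneID1{$array1[$a]}`, which is not a key. A minimal sketch of that mismatch, assuming tab-separated fields and using a hypothetical one-line sample in place of the file:

```perl
use strict;
use warnings;

# Hypothetical one-line sample; the fields are assumed to be tab-separated.
my $line = "PNMA3\tQ9H0A4\tHomo sapiens\tparaneoplastic antigen MA3";

my %geneID1;
if ($line =~ /.*?\t(.*?)\t.*/) {
    $geneID1{$1} = $line;    # stored under the accession, "Q9H0A4"
}

# Looking up by the whole line finds nothing: the line is not a key.
my $by_line = $geneID1{$line};
# Looking up by the accession returns the stored line.
my $by_key = $geneID1{'Q9H0A4'};

print defined($by_line) ? "by line: found\n" : "by line: undef\n";
print defined($by_key)  ? "by key: found\n"  : "by key: undef\n";
```

Interpolating the undefined lookup into a string is what would trigger "Use of uninitialized value within %geneID1 in string".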

I found a solution: use a while loop instead. It not only works, it is also more elegant. The new code is:

#!/usr/bin/perl

#pseudocode:
#open file1, store UniProt accession as key and the line as value
#open file2, store UniProt accession as key and the line as value, only for lines that contain "IDA"
#compare keys in two hashes, find out matched keys
#print out lines from file2 that match

use strict;
use warnings;
use feature qw(say);

my $infile1 = "geneIDs3_MouseToUniProtAccessions.txt";
my $inFH1;
open ($inFH1, "<", $infile1) or die join (" ", "Can't open", $infile1, "for reading:", $!);
my %geneID1;

while (<$inFH1>){
    $_ =~ /.*?\t(.*?)\t.*/;
    $geneID1{$1} = $_;
    say ("$1", '->', "$geneID1{$1}");
}
close $inFH1;

Thanks, everyone, for your help!

2 answers:

Answer 0: (score: 3)

#!/usr/bin/perl

use strict;
use warnings;
use feature qw( say );

<>; # Skip header.

my %geneID1;
while (<>) {
   chomp;
   my @fields = split /\t/;
   my $id = $fields[1];
   $geneID1{$id} = $_;
}

say "$_ => $geneID1{$_}" for sort keys %geneID1;

(Pass geneIDs3_MouseToUniProtAccessions.txt as an argument.)

Answer 1: (score: 2)

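It is hard to tell what the error is, given the tabs (are they really tabs?) and the code changing in the question.

However, there are a number of things in the code that can be improved: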

use warnings;
use strict;
use feature 'say';

my $file = 'geneIDs3_MouseToUniProtAccessions.txt';
open my $fh, '<', $file or die "Can't open $file: $!";

my %geneID1;

my $header = <$fh>;    
while (<$fh>) {
    chomp;
    $geneID1{ (split /\t/)[1] } = $_; 
}

say "$_ => $geneID1{$_}" for sort keys %geneID1;

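The one wild card is your data: if you are not sure the separators are TAB characters, split on \s+ (which also matches tabs), since you only need the second field. Splitting on whitespace is also the default for split, so you can simply write (split)[1]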
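
A small sketch comparing the two ways of splitting, assuming the sample line from the question really is tab-separated (the `$line` variable below is a hypothetical stand-in for a line read from the file):

```perl
use strict;
use warnings;

# Hypothetical sample line; tab-separated fields are an assumption.
my $line = "PNMA3\tQ9H0A4\tHomo sapiens\tparaneoplastic antigen MA3";

my @by_tab   = split /\t/, $line;   # 4 fields; "Homo sapiens" stays whole
my @by_space = split ' ', $line;    # 7 fields here; same pattern a bare split uses

print "tab:   $by_tab[1] (", scalar @by_tab, " fields)\n";
print "space: $by_space[1] (", scalar @by_space, " fields)\n";
```

Both give the same second field for this line. Whitespace splitting skips leading whitespace, though, so it could shift the index if the first column were ever empty; splitting on /\t/ is the safer choice when the file is known to be tab-separated.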

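Comments on the original code

  • Read the whole file up front only if there is a specific reason to

  • Declare everything, even where a special variable lets you get away without it ($a)

  • Declare in the smallest scope possible and close to where it is needed: open my $fh, ...

  • Do not use special variables such as $a except for their intended purpose! ($a and $b are the package variables that sort uses.)

  • A C-style for loop is hardly ever needed. If you need the index in the iteration, use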

    foreach my $i (0 .. $#ary) { ... }

    where $#ary is the index of the last element of the array @ary
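
The last few points can be sketched together as follows. The `@ary` contents and the `%by_id` name are hypothetical stand-ins for the lines read from the file, and the fields are assumed to be tab-separated:

```perl
use strict;
use warnings;

# Hypothetical data standing in for the lines read from the file.
my @ary = ("A\tQ1\tx", "B\tQ2\ty", "C\tQ3\tz");

my %by_id;
foreach my $i (0 .. $#ary) {                # $i is lexical to the loop; $a is left alone
    my $id = (split /\t/, $ary[$i])[1];     # second tab-separated field is the key
    $by_id{$id} = $ary[$i];                 # the whole line is the value
    print "$i: $id\n";
}
```

When the index itself is not needed, `for my $line (@ary) { ... }` is simpler still.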