Question

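I'm very new to Perl, so I apologize if this is a naive question.

I'm using a Perl script as a wrapper for Python, text formatting, and so on, and I'm struggling to get the output I want.

The script takes a folder, which for this example contains 6 text files (test1.txt through test6.txt). It then extracts some information from the files, runs a series of command-line programs, and outputs a tab-delimited result. However, that result contains only the files that made it through the rest of the script's processing (i.e. the ones that produced a result).

Here are some snippets of what I have so far: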

use strict;
use warnings;

## create array to capture all of the file names from the folder
opendir(DIR, $folder) or die "couldn't open $folder: $!\n";
my @filenames = grep { /\.txt$/ } readdir DIR;
closedir DIR;

#here I run some subroutines, the last one looks like this
my $results = `blastn -query $shortname.fasta -db DB/$db -outfmt "6 qseqid sseqid score evalue" -max_target_seqs 1`;
#now I would like to compare what is in the @filenames array with $results

Example of the tab-delimited results, as stored in $results:

test1.txt    200    1:1-20      79     80
test3.txt    800    1:1-200     900    80
test5.txt    900    1:1-700     100    2000
test6.txt    600    1:1-1000    200    70

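I'd like the final output to include every file that was run through the script, so I think I need a way to compare two arrays, or to compare an array with a hash?

Example of the desired output: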

test1.txt    200    1:1-20      79     80
test2.txt    0      No result
test3.txt    800    1:1-200     900    80
test4.txt    0      No result
test5.txt    900    1:1-700     100    2000
test6.txt    600    1:1-1000    200    70

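Update

OK, so I got this to work with @terdon's suggestion, by reading the file into a hash and then comparing. Now I'm trying to figure out how to do this without writing to a file and reading it back in, and I still can't seem to get the syntax right. Here's what I have, but it seems I can't match the array against the hash, which means the hash must not be correct: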

#!/usr/bin/env perl

use strict;
use warnings;

#create variable to mimic blast results
my $blast_results = "file1.ab1  9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984";

#create array to mimic filename array
my @filenames = ("file1.ab1", "file2.ab1", "file3.ab1");

#header for file
my $header = "Query\tSeq_length\tTarget found\tScore (Bits)\tExpect(E-value)\tAlign-length\tIdentities\tPositives\tChr\tStart\tEnd\n";

#initialize hash
my %hash;
#split blast results into array
my @row = split(/\s+/, $blast_results);
$hash{$row[0]}=$_;
print $header;
foreach my $file (@filenames){
    ## If this filename has an associated entry in the hash, print it
    if(defined($hash{$file})){
        print "$row[0]\t$row[9]\t$row[1]:$row[7]-$row[8]\t$row[2]\t$row[3]\t$row[4]\t$row[5]\t$row[6]\t$row[1]\t$row[7]\t$row[8]\n";
        }
    ## If not, print this.
    else{
        print "$file\t0\tNo Blast Results: Sequencing Rxn Failed\n";
        }
    }
print "-----------------------------------\n";      
print "$blast_results\n"; #test what results look like
print "-----------------------------------\n"; 
print "$row[0]\t$row[1]\n"; #test if array is getting split correctly
print "-----------------------------------\n"; 
print "$filenames[2]\n"; #test if other array present

The output of this script is (the @filenames array does not match anything in the hash):

Query   Seq_length  Target found    Score (Bits)    Expect(E-value) Align-length    Identities  Positives   Chr Start   End
file1.ab1   0   No Blast Results: Sequencing Rxn Failed
file2.ab1   0   No Blast Results: Sequencing Rxn Failed
file3.ab1   0   No Blast Results: Sequencing Rxn Failed
-----------------------------------
file1.ab1   9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984
-----------------------------------
file1.ab1   9
-----------------------------------
file3.ab1

Answer 1

I'm not entirely sure what you need, but the equivalent of awk's A[$1]=$0 is done with a hash in Perl. Something like:

my %hash;
## Open the output file
open(my $fh, "<", "text_file") or die "couldn't open text_file: $!\n";
while(<$fh>){
    ## remove newlines
    chomp;
    ## split the line
    my @A=split(/\s+/);
    ## Save this in a hash whose keys are the 1st fields and whose
    ## values are the associated lines. 
    $hash{$A[0]}=$_;
}
close($fh);
## Now, compare the file to @filenames
foreach my $file (@filenames){
    ## If this filename has an associated entry in the hash, print the
    ## saved line (it already starts with the file name)
    if(exists($hash{$file})){
        print "$hash{$file}\n";
    }
    ## If not, print the file name followed by a placeholder.
    else{
        print "$file\t0\tNo result\n";
    }
}
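
Regarding the update: in that script, split(/\s+/, $blast_results) flattens both result lines into one long list, and $hash{$row[0]}=$_ stores $_, which is undefined outside a loop, so defined($hash{$file}) is false for every file. The same hash idea should work on a string held in memory if you split it into lines first. Here is a minimal sketch, assuming the results are newline-separated with whitespace-separated fields; the sub names index_by_first_field and report_lines are made up for illustration and are not from your script:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build a hash keyed on the first field of each line of a multi-line string.
# (index_by_first_field is a hypothetical helper name.)
sub index_by_first_field {
    my ($text) = @_;
    my %by_name;
    foreach my $line (split /\n/, $text) {
        next unless $line =~ /\S/;          # skip blank lines
        my @fields = split ' ', $line;      # split on runs of whitespace
        $by_name{ $fields[0] } = \@fields;  # keep only this line's fields
    }
    return \%by_name;
}

# Return one tab-delimited line per file name, in the order given.
# (report_lines is a hypothetical helper name.)
sub report_lines {
    my ($filenames, $by_name) = @_;
    my @out;
    foreach my $file (@$filenames) {
        if (exists $by_name->{$file}) {
            push @out, join("\t", @{ $by_name->{$file} });
        }
        else {
            push @out, join("\t", $file, 0, "No result");
        }
    }
    return @out;
}

# Sample data copied from the update in the question.
my $blast_results = "file1.ab1  9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984";
my @filenames = ("file1.ab1", "file2.ab1", "file3.ab1");

my $by_name = index_by_first_field($blast_results);
print "$_\n" for report_lines(\@filenames, $by_name);
```

With the sample data this prints three lines, and file2.ab1 gets the "No result" placeholder. Because each hash value is an array reference to that line's own fields, you can reorder the columns when printing (for example $by_name->{$file}[9]) without touching another line's data.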

Perl: how to compare an array with a hash and print the results
