Question

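I have prepared the following script, which takes a GI ID number from NCBI that I have stored in a TSV file and prints the scientific name associated with that ID: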

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Taxonomy;

my ($filename) = @ARGV;
open my $fh, '<', $filename or die qq{Unable to open "$filename": $!};

while(<>) {
        my ($taxonid, $counts) = (split /\t/);
        for my $each($taxonid) {
                print "$each\n";
                my $db = Bio::DB::Taxonomy->new(-source => 'entrez');
                my $taxon = $db->get_taxon(-taxonid => $taxonid);
                print "Taxon ID is $taxon->id, \n";
                print "Scientific name is ", $taxon->scientific_name, "\n";
        }
}

With this script I get the following:

1760

Taxon ID is Bio::Taxon=HASH(0x33a91f8)->id,

Scientific name is Actinobacteria

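What I want to do

The next step is to print the full taxonomic path of the bacterium in question. For the example above, I would like to get k__Bacteria; p__Actinobacteria; c__Actinobacteria as output. In addition, I would like the GI IDs in my table to be replaced with this full taxonomic path.

Which direction should I go in?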

Answer 1

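First, I notice that you open $filename, which is your first command-line argument, but you never use the file handle $fh that you created.

So in your case these two lines are not needed, because <> already does the trick: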

my ($filename) = @ARGV;
open my $fh, '<', $filename or die qq{Unable to open "$filename": $!};
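To illustrate why the explicit open is redundant: the diamond operator <> opens every file named in @ARGV on its own. A minimal, self-contained sketch follows; the temporary file and its two sample rows are made up for the demonstration and are not your real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample input: one "taxon ID <TAB> count" pair per line.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "1760\t42\n12\t7\n";
close $tmp_fh or die qq{Unable to close "$tmp_name": $!};

# Pretend the file name was passed on the command line.
@ARGV = ($tmp_name);

my @rows;
while (my $line = <>) {    # <> opens each file listed in @ARGV by itself
    chomp $line;           # remove the trailing newline before splitting
    my ($taxonid, $counts) = split /\t/, $line;
    push @rows, [$taxonid, $counts];
}

print "$_->[0] (count $_->[1])\n" for @rows;
```

If you would rather keep the explicit open, read from the handle with while (my $line = <$fh>) instead of <>.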

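Next step. I don't know what is in your file or in your database, so I can't help you there. Could you give an example of what the database and the file contain?

One more thing I see here is that you probably don't need to create your $db instance inside the loop.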

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Taxonomy;

# Create the database handle once, outside the loop.
my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

while (<>) {
        chomp;    # drop the trailing newline before splitting
        my ($taxonid, $counts) = split /\t/;
        for my $each ($taxonid) {
                print "$each\n";
                my $taxon = $db->get_taxon(-taxonid => $each);
                # Method calls are not interpolated inside double quotes.
                print "Taxon ID is ", $taxon->id, "\n";
                print "Scientific name is ", $taxon->scientific_name, "\n";
        }
}
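
A side note on the output shown in the question: the Bio::Taxon=HASH(0x33a91f8)->id text appears because Perl interpolates variables, but not method calls, inside double-quoted strings. A minimal sketch of the effect and two ways around it, with a hypothetical My::Taxon class standing in for Bio::Taxon:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Taxon with a single accessor.
{
    package My::Taxon;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub id  { return $_[0]{id} }
}

my $taxon = My::Taxon->new(id => 1760);

# Only the variable is interpolated; "->id" stays literal text.
my $wrong = "Taxon ID is $taxon->id";

# Call the method outside the string and concatenate ...
my $concat = "Taxon ID is " . $taxon->id;

# ... or force interpolation of an arbitrary expression.
my $inline = "Taxon ID is @{[ $taxon->id ]}";

print "$wrong\n$concat\n$inline\n";
```

The first line printed contains the stringified object followed by a literal ->id; the other two print the numeric ID.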

Edit

It is hard to help you based on your comment. When you write

my $taxon = $db->get_taxon(-taxonid => $taxonid);

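you get back a Bio::Taxon node, which is described in the Bio::Taxon documentation.

I don't know what k__Bacteria; p__Actinobacteria; c__Actinobacteria stands for. Is it information provided by the Bio::Taxon node?

In any case, you can still explore $taxon with: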

#!/usr/bin/env perl
# Author: Yves Chevallier
# Date:
use strict;
use warnings;
use Data::Dumper;
use Bio::DB::Taxonomy;

my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

while (<DATA>) {
        chomp;    # drop the trailing newline before splitting
        my ($taxonid, $counts) = split /\t/;
        for my $each ($taxonid) {
                print "$each\n";
                my $taxon = $db->get_taxon(-taxonid => $each);
                print Dumper $taxon;    # dump the whole node to see what it holds
                # Method calls are not interpolated inside double quotes.
                print "Taxon ID is ", $taxon->id, "\n";
                print "Scientific name is ", $taxon->scientific_name, "\n";
        }
}

__DATA__
12
1760
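
If the dump shows that each node knows its rank and its parent, one possible way to build a path such as k__Bacteria; p__Actinobacteria; c__Actinobacteria is to walk upward from the node and keep only the ranks of interest. The sketch below is an assumption, not a tested solution against NCBI: it relies on the node offering rank(), scientific_name() and ancestor() (as I understand the Bio::Taxon documentation, ancestor() looks up the parent through the node's database handle, but please verify this). The rank-to-prefix table is my guess at the naming scheme, and Mock::Taxon is a hypothetical stand-in so the example runs without BioPerl or network access.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed mapping from rank names to the prefixes used in the question;
# adjust it to whatever rank names the database actually returns.
my %prefix_for = (
    superkingdom => 'k', domain => 'k', phylum => 'p', class => 'c',
    order        => 'o', family => 'f', genus  => 'g', species => 's',
);

# Walk from a node up to the root, keeping only ranks that have a prefix.
# Works with any object that offers ancestor(), rank() and scientific_name().
sub lineage_string {
    my ($node) = @_;
    my @parts;
    while ($node) {
        my $rank = $node->rank // '';
        if (my $prefix = $prefix_for{$rank}) {
            unshift @parts, $prefix . '__' . $node->scientific_name;
        }
        $node = $node->ancestor;    # undef once the root is passed
    }
    return join '; ', @parts;
}

# Hypothetical mock node, used only to make the sketch self-contained.
{
    package Mock::Taxon;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub rank            { return $_[0]{rank} }
    sub scientific_name { return $_[0]{name} }
    sub ancestor        { return $_[0]{parent} }
}

my $root = Mock::Taxon->new(rank => 'no rank', name => 'cellular organisms');
my $kingdom_node = Mock::Taxon->new(
    rank => 'superkingdom', name => 'Bacteria', parent => $root);
my $phylum_node = Mock::Taxon->new(
    rank => 'phylum', name => 'Actinobacteria', parent => $kingdom_node);
my $class_node = Mock::Taxon->new(
    rank => 'class', name => 'Actinobacteria', parent => $phylum_node);

my $path = lineage_string($class_node);
print "$path\n";    # prints: k__Bacteria; p__Actinobacteria; c__Actinobacteria
```

To replace the IDs in your table, you could then print the path together with the count for each row, for example print join("\t", lineage_string($taxon), $counts), "\n"; inside the loop.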
