Combining several scripts into one script using subroutines in Perl

Date: 2016-12-05 00:48:39

Tags: perl bioinformatics

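I am trying to merge several scripts into a single script by turning each of them into a subroutine. The problem is that I cannot pass the output of one subroutine to the next one as its input. This needs to be done for multiple scripts. These are the first 2 scripts from the whole list.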

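The data generated by code 1 needs to be handed to code 2, and so on. But code 2 also has a step that compares the generated file with the original file.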

Code 1: 
subst_head_1($infile);


sub subst_head_1
{
    ##this code helps organise the file in a way that it makes it more convenient for the file to be pushed into a hash for later analysis

    ##opening file
    my $i = $_[0];

    open(IN, "<$i") || die "\n Error: Cannot open the infile: $infile\n";
   # open(OUT, ">op.fa");

    ##giving all the headers in the original file line numbers
    my $lineno = 1;

    while(<IN>)
    {
    chomp;
    if ($_ =~ />/)
    {

        $_ = $lineno++,"\t", $_ ,"\n";

        subst_head_2($_);

    }
    }

}
    ##file organised in the following format; eg., "2>CBB_deg7180000000601_1100_2101_3"

sub subst_head_2
{

    ##opening files with header information(result of head-subs-1) and the original sequence(submitted query file) file for further info
    my $i = $_[0];
    #print $i;
    my $i_1 = $_[1];


    ##pushing file(headerinfo.txt) with the header information into a hash
    open(IN, "<$i");
    my @file = <IN>;

    my $file2 = join('', @file);

    my %hash = split(/[\t\n]/, $file2);

    ##opening the original file with the sequence information into an array
    open(IN1, "<$i_1");
    my @fila = <IN1>;


    ##for each of the sequences in the sequence file
    foreach my $fila(@fila)
    {

    ##removing any "*" in the file, especially at the end of some of the sequences which were present in the file

    $fila =~ s/\*//g;

    ##regex for matching with the header information in the file with all the query information

    if($fila =~ /^\>(\S+).*/)
    {

        ##putting info(eg., CBB_deg7180000000601_1100_2101_3) into a variable $user

        my $user = $1;

        foreach my $has(sort keys %hash)
        {

        ##regex for the values in the key-value relationship in the headerinfo file
        if($hash{$has} =~ /^\>(\S+).*/)
        {

            ##putting info(eg., CBB_deg7180000000601_1100_2101_3) into a variable $user1
            my $user1 = $1;

            ##is the info the same?; if it is, then substitute it in the original with key from headerinfo.txt

            if($user eq $user1)
            {

            ##substitute header in the original file with the unique number;

            $fila =~ s/^\>(\S+).*\n/>$has\n/;

            }
        }
        }
    }
    }

    print @fila;
}

1 answer:

Answer 0 (score: 1)

My answer explains the most immediate problems I see in your code.

Instead of this line

$_ = $lineno++,"\t", $_ ,"\n";

you probably had this before you changed your program.

print $lineno++,"\t", $_ ,"\n";

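What you did was change it into an assignment of all of that to $_. But that variable is a scalar, which means it holds one single value; it cannot hold a list. Because = binds more tightly than the comma operator, your assignment puts only the first thing on the right-hand side of the = into $_. That is the result of $lineno++, which is the old value of $lineno ($var++ increments the variable but returns the old value). The rest is discarded.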
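To see the difference in isolation, here is a minimal sketch. The header string is made up for illustration, and join is only one way to build the string on purpose (the . operator would work too).

```perl
use strict;
use warnings;

my $lineno = 1;
my $header = '>CBB_deg7180000000601_1100_2101_3';    # made-up example header

# What the line from the question effectively does: the assignment happens
# first, so only the old value of $lineno is stored; the rest is thrown away.
my $broken;
{
    no warnings 'void';    # silence "Useless use ... in void context"
    $broken = $lineno++, "\t", $header, "\n";
}

# Building a single string deliberately, with a tab between the two parts.
my $joined = join "\t", $lineno++, $header;

print "broken: $broken\n";    # just the number 1
print "joined: $joined\n";    # number 2, a tab, then the header
```

With warnings enabled and without the no warnings 'void' line, perl itself complains about the discarded values, which is a useful hint that a statement is not doing what it looks like.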

Now your call subst_head_2($_) passes only the line number. But in the subroutine you are expecting two arguments.

my $i = $_[0];
my $i_1 = $_[1];

The second one, $i_1 (a terrible name for a variable), is undef, so you cannot use it as a filename to open a file.
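One way to catch this kind of mistake early is to unpack both arguments at once and check them before opening anything. This is only a sketch: read_inputs, $header_file and $sequence_file are names I made up, not something from your program.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical stand-in for the start of subst_head_2.
sub read_inputs {
    my ($header_file, $sequence_file) = @_;
    croak "expected a header file and a sequence file"
        unless defined $header_file && defined $sequence_file;

    my @contents;
    for my $file ($header_file, $sequence_file) {
        # Three-argument open with a lexical handle; $! says why it failed.
        open my $fh, '<', $file or croak "Cannot open '$file': $!";
        push @contents, [<$fh>];
        close $fh;
    }
    return @contents;    # two array references: header lines, sequence lines
}

# Calling it with a single argument, as the question's code does, now fails
# with a clear message instead of quietly trying to open undef.
my $ok    = eval { read_inputs(1); 1 };
my $error = $@;
print "call failed: $error" unless $ok;
```

Checking the return value of open also matters in the original subroutine, where both open calls are unchecked.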

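Unfortunately, your description lacks a lot of information, so I cannot tell you what you actually need to do. Please provide sample input and output data, and think about what you want your code to do.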
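Without sample data, the following is only a guess at the intent, based on the comments in the question's code: the first step numbers the headers, and the second step replaces each header in the original data with its number. If that is roughly right, the two steps can hand data to each other directly instead of going through an intermediate file. The names number_headers and renumber_headers and the sample records are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical step 1: give every header a running number, return name => number.
sub number_headers {
    my @lines = @_;
    my %number_for;
    my $lineno = 1;
    for my $line (@lines) {
        next unless $line =~ /^>(\S+)/;
        $number_for{$1} = $lineno++;
    }
    return \%number_for;
}

# Hypothetical step 2: strip "*" and replace each known header with its number.
sub renumber_headers {
    my ($number_for, @lines) = @_;
    my @out;
    for my $line (@lines) {
        (my $copy = $line) =~ s/\*//g;
        if ($copy =~ /^>(\S+)/ && exists $number_for->{$1}) {
            $copy = ">$number_for->{$1}\n";
        }
        push @out, $copy;
    }
    return @out;
}

# Made-up input standing in for the lines of the original sequence file.
my @fasta  = (">seqA some description\n", "MKV*\n", ">seqB\n", "MTT\n");
my @result = renumber_headers(number_headers(@fasta), @fasta);
print @result;
```

Looking the name up in a hash also avoids the inner loop over all keys for every header line.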