Question

Hello, I use this gawk command to split a FASTA file:

gawk '/^>c/ {OUT=substr($0,2) ".fa";print " ">OUT}; OUT{print >OUT}' your_input

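It runs perfectly from the terminal. I just want to use it inside a Perl script via system, with a string as the input file name, but I don't know how to do that.

I tried this: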

my $string = "secuence.fa"; #this is the file I wanna split .

my $cmd= (gawk '/^>c/ {OUT=substr($0,2) ".fa";print " ">OUT}; OUT{print >OUT}' $string);
system $command;

When I run the script, it says I have a syntax error in $cmd, but I can't find it.

Thanks.

Answer 1

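Splitting FASTA in Perl is quite simple. Perl lets you change the input record separator used when reading a file. If you set it to "\n>", Perl does all the work for you.

Here is an example: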

use strict;

# Set the input record separator to the FASTA record separator
local $/ = "\n>";

while (<DATA>) {
    print "---- New sequence ---\n";
    # perl will put the separator at the end of the record,
    # so we need to remove the separator from the end,
    # and add it back at the beginning
    s/[\n>]+$//;
    s/^(?!>)/>/;
    print $_, "\n";
}

__DATA__
>seq1
ACGTACCTA
>seq2
TTCACTTAC
>seq3
ACCTTATTA
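To get one file per sequence, as the gawk command does, the same record-separator idea can be extended. The following is only a sketch: the sub names `split_fasta` and `run_gawk_split` are made up for illustration, and naming each output file after the first word of its header is an assumption (unlike the gawk pattern `/^>c/`, it does not filter on headers starting with "c"). The second sub shows one way to call the original gawk program from Perl: passing a list to `system` bypasses the shell, so the awk program needs no extra quoting.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA records from a filehandle and write each
# one to "<dir>/<name>.fa". Returns the list of names written.
sub split_fasta {
    my ($in, $dir) = @_;
    local $/ = "\n>";                  # FASTA record separator, as above
    my @names;
    while (my $rec = <$in>) {
        $rec =~ s/[\n>]+\z//;          # drop the trailing separator
        $rec =~ s/\A>//;               # only the first record keeps its ">"
        next unless $rec =~ /\A(\S+)/; # first word of the header line
        my $name = $1;                 # assumed to be safe as a file name
        open my $out, '>', "$dir/$name.fa"
            or die "Cannot write $dir/$name.fa: $!";
        print {$out} ">$rec\n";
        close $out;
        push @names, $name;
    }
    return @names;
}

# Hypothetical helper: run the gawk program from the question on one file.
# The list form of system() passes each argument to gawk without a shell.
sub run_gawk_split {
    my ($file) = @_;
    my $prog = q{/^>c/ {OUT=substr($0,2) ".fa";print " ">OUT}; OUT{print >OUT}};
    system('gawk', $prog, $file) == 0
        or die "gawk failed with status $?";
}

# Usage (file name taken from the question):
# open my $in, '<', 'secuence.fa' or die "Cannot open secuence.fa: $!";
# split_fasta($in, '.');
# run_gawk_split('secuence.fa');
```

In the attempt from the question, the command is not quoted as a Perl string and is stored in `$cmd` while `system` is given `$command`; the list form above avoids both problems.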
