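I merged two sequence files, so I now have one file containing two sequences. I split both sequences into @chars arrays because I later have to compare them character by character. However, one of the sequences spans two lines. I would like to use the join function to combine the two lines, but I don't know how.
Example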
seq 1
ACGTATATATTATATCTGGCGCTATCGATGCTATCGAT
CGATGCGCG
seq 2
AGTGAGCGTAGCTAGCGGCGCGATCTAGCTA
My code so far
#!/usr/bin/perl
use strict;
use warnings;
# open file 1
open (my $seq1, "<", "file1.fa") or die $!;
# open file 2
open (my $seq2, "<", "file2.fa") or die $!;
# open combined file
open (my $combined, ">", "combined.txt") or die $!;
# read file 1, skip header line, write to combined file
while (my $line = <$seq1>) {
if($line =~ />/) {
next;
}
else {
print $combined "$line\n";
}
}
# read file 2, skip header line, write to combined file on new line
while (my $line2 = <$seq2>) {
if ($line2 =~ />/) {
next;
}
else {
print $combined "$line2\n";
}
}
# need to open combined file for reading
open (my $combined2, "<", "combined.txt") or die $!;
# read through combined file line by line
while (my $seqs = <$combined2>) {
chomp($seqs);
# split sequences into characters
my @chars = split(//, $seqs);
# the sequence from file1 is on 2 separate lines. Need to join these
# lines together
Answer 0 (score: 4)
Consider using Bio::SeqIO to read your FASTA files, since it handles sequences that span multiple lines:
use strict;
use warnings;
use Bio::SeqIO;
my $in = Bio::SeqIO->new( -file => "file1.fa", '-format' => 'Fasta' );
while ( my $seq = $in->next_seq ) {
my $sequence = $seq->seq;
print $sequence, "\n";
}
Contents of file1.fa:
>seq0
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>seq1
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME
LKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>seq2
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>seq3
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
Output:
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLMELKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
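If installing BioPerl is not an option, the same idea can be sketched in core Perl using the join function the question asks about. This is only a minimal sketch, assuming a well-formed FASTA input in which every record starts with a ">" header line; read_fasta is a hypothetical helper name, and the input is held in a string (built from the question's example data) so the snippet runs without any files.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [header, sequence] pairs
# read from an already opened FASTA filehandle.
sub read_fasta {
    my ($fh) = @_;
    my (@records, $header, @parts);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            # A new header closes the previous record: join its lines into one string.
            push @records, [$header, join('', @parts)] if defined $header;
            ($header, @parts) = ($1);
        }
        else {
            # Sequence line: collect it until the next header or end of file.
            push @parts, $line;
        }
    }
    # Close the last record.
    push @records, [$header, join('', @parts)] if defined $header;
    return @records;
}

# Demo input kept in a string (assumption: same layout as the question's example).
my $fasta = ">seq1\nACGTATATATTATATCTGGCGCTATCGATGCTATCGAT\nCGATGCGCG\n"
          . ">seq2\nAGTGAGCGTAGCTAGCGGCGCGATCTAGCTA\n";
open(my $fh, '<', \$fasta) or die $!;
my @records = read_fasta($fh);
close $fh;

for my $rec (@records) {
    my ($name, $seq) = @$rec;
    my @chars = split //, $seq;    # one element per character, as in the question
    print "$name: $seq (", scalar(@chars), " characters)\n";
}
```

To read a real file, open it with open(my $fh, '<', 'file1.fa') and pass that handle instead of the in-memory one.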
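Answer 1 (score: 0)
I assume your sequences are separated by the ">" sign, which is why you use if ($_ =~ />/) to skip the header lines. If that is not the case, reply and I will change the code. Please try the following: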
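use strict;
use warnings;
# open file 1
open (my $fil1, "<", "file1.fa") or die $!;
# open file 2
open (my $fil2, "<", "file2.fa") or die $!;
# open combined file
open (my $combined, ">", "combined.txt") or die $!;
# read file 1: a header line starts a new output line,
# sequence lines are appended without their trailing newline
while (my $line = <$fil1>) {
chomp $line;
if ($line =~ /^>/) {
print $combined "\n";
}
else {
print $combined $line;
}
}
# read file 2 the same way, so its sequence starts on a new line
while (my $line2 = <$fil2>) {
chomp $line2;
if ($line2 =~ /^>/) {
print $combined "\n";
}
else {
print $combined $line2;
}
}
# finish the last line and flush the output
# (note: combined.txt starts with an empty line, written for the first header)
print $combined "\n";
close $combined or die $!;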
Then check combined.txt to see whether any sequence is still split across lines.
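Once each sequence is on a single line, the character-by-character comparison mentioned in the question could look roughly like the following. This is a sketch under assumptions: mismatch_positions is a hypothetical helper, positions are 0-based, and only the length shared by both sequences is compared, since the question does not say how sequences of different length should be handled.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 0-based positions at which two
# sequences differ, looking only at the length they have in common.
sub mismatch_positions {
    my ($seq_a, $seq_b) = @_;
    my @first  = split //, $seq_a;
    my @second = split //, $seq_b;
    my $shared = scalar(@first) < scalar(@second) ? scalar(@first) : scalar(@second);
    return grep { $first[$_] ne $second[$_] } 0 .. $shared - 1;
}

# Toy sequences chosen for illustration, not taken from the files above.
my @diffs = mismatch_positions("ACGT", "AGGTA");
print "Mismatch positions: @diffs\n";
```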