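Below are the contents of two arrays that I generate. How can I combine the two arrays so that the duplicated header appears only once but the format stays the same, almost like building a matrix? I am currently using mesh to merge the arrays into one, but it does not give the result I want. I have not found anything else that might help, such as split or push. My code is shown below.
Input file "phred.txt"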
"#$%&'()
Input file "bases.txt"
ABCDEFGH
Output from printing array 1
Sequence_1
1 2 3 4 5
Output from printing array 2
Sequence_1
A B C D E
Desired output from combining the two arrays
Sequence_1
1 2 3 4 5
A B C D E
Current result using the mesh approach
Sequence_1
Sequence_1
1A 2B 3C 4D 5E
Current code
use warnings;
use strict;
use List::MoreUtils qw(mesh);
open( PHRED, '<', '/path/to/phred.txt' ) or die $!;
open( BASES, '<', '/path/to/bases.txt' ) or die $!;
open( OUT, '>', '/path/to/out.txt' ) or die $!;
my @symbols;
my @bases;
my $count = 0;
my @finalphred;
my @finalbases;
my %hash = (
'"' => "1",
'#' => "2",
'$' => "3",
'%' => "4",
'&' => "5",
q(') => "6",
'(' => "7",
')' => "8"
);
while ( my $fastq = <PHRED> ) {
my $substring = substr( $fastq, 0, 5 );
push( @symbols, $substring );
}
foreach ( @symbols ) {
my @eachsymbol = split //, $_;
$count++;
push( @finalphred, "\n", "Sequence_$count\n" );
foreach my $symbol ( @eachsymbol ) {
if ( exists( $hash{$symbol} ) ) {
push( @finalphred, $hash{$symbol}, "\t" );
}
}
}
my $count_again = 0;
while ( my $fastq_again = <BASES> ) {
my $substring_again = substr( $fastq_again, 0, 5 );
push( @bases, $substring_again );
}
foreach ( @bases ) {
my @eachsymbol_again = split //, $_;
$count_again++;
push( @finalbases, "\n", "Sequence_$count_again\n" );
foreach my $symbol_again (@eachsymbol_again){
push (@finalbases, $symbol_again, "\t");
}
}
foreach (@finalphred){ #diagnostic to test array contents
print "$_";
}
foreach (@finalbases){ #diagnostic to test array contents
print "$_";
}
my @last = mesh @finalphred, @finalbases;
print OUT @last;
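Thanks for helping me finish this code and get the correct output!
Answer 0 (score: 1)
One of the main problems is that you never print anything from @eachsymbol_again. You split each four-character string into four characters, put them into that array, and then ignore it. The code certainly does not produce the output you say it does.
Also, mesh is an odd choice for combining your arrays the way you do.
For reference, your arrays look like this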
(
"\n",
"Sequence_1\n",
1,
"\t",
2,
"\t",
3,
"\t",
4,
"\t",
"\n",
"Sequence_2\n",
5,
"\t",
6,
"\t",
7,
"\t",
8,
"\t",
)
(
"\n",
"Sequence_1\n",
"\n",
"Sequence_2\n"
)
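The two arrays do not even have the same number of elements, so calling mesh on them does not make much sense.
Here is a working program
I used the following data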
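To see why interleaving cannot give the layout asked for, here is a minimal sketch. It does not call List::MoreUtils; the interleave helper is a hypothetical stand-in that mimics mesh by taking one element from each array in turn, and the sample arrays are shortened versions of the ones in the question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for mesh: take one element from each array in
# turn; the shorter array contributes undef once it runs out.
sub interleave {
    my ( $left, $right ) = @_;
    my $max = @$left > @$right ? @$left : @$right;
    return map { ( $left->[$_], $right->[$_] ) } 0 .. $max - 1;
}

# Shortened sample data in the same shape as the question's arrays.
my @scores  = ( "Sequence_1\n", 1,   "\t", 2,   "\t" );
my @letters = ( "Sequence_1\n", 'A', "\t", 'B', "\t" );

my @meshed = interleave( \@scores, \@letters );

# The header comes out twice and every score is glued to its base.
print @meshed, "\n";
```

Because the elements alternate, the two headers end up next to each other and each score lands directly in front of its base, which is the "1A 2B" pattern shown in the question.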
"#$%
&'()
ABCD
EFGH
use strict;
use warnings 'all';
use autodie;
my %xlate = map { chr($_ + 33) => $_ } 1 .. 8;
open my $phred_fh, '<', 'phred.txt';
open my $bases_fh, '<', 'bases.txt';
my $n;
until ( eof $phred_fh or eof $bases_fh ) {
my @syms = map [ split //, substr <$_>, 0, 4 ], $phred_fh, $bases_fh;
printf "Sequence_%d\n", ++$n;
print join("\t", map $xlate{$_}, @{$syms[0]}), "\n";
print join("\t", @{$syms[1]}), "\n";
print "\n";
}
Output
Sequence_1
1 2 3 4
A B C D
Sequence_2
5 6 7 8
E F G H
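The %xlate line in the program above works because the eight symbols are consecutive ASCII characters, so each score is the character code minus 33. As a quick check, this sketch builds the table both ways and compares them; the names %by_hand and @mismatches are only for illustration.

```perl
use strict;
use warnings;

# The lookup table written out by hand, as in the question.
my %by_hand = (
    '"'  => 1, '#' => 2, '$' => 3, '%' => 4,
    '&'  => 5, q(') => 6, '(' => 7, ')' => 8,
);

# The same table generated from the character codes 34 .. 41.
my %xlate = map { chr( $_ + 33 ) => $_ } 1 .. 8;

# Collect any symbol whose generated score differs from the hand-written one.
my @mismatches =
    grep { !exists $xlate{$_} or $xlate{$_} != $by_hand{$_} } keys %by_hand;

printf "%s => %d\n", $_, $xlate{$_} for sort keys %xlate;
print "tables differ\n" if @mismatches;
```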
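Answer 1 (score: 0)
I don't think you need mesh at all for this job. It is simpler to read the files into arrays, process them, and then write them to the output file with formatting. If the files are too large to fit in main memory, the same processing can also be done line by line.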
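The line-by-line variant mentioned above could look roughly like the following sketch. To keep it self-contained it reads from and writes to in-memory filehandles; with real data you would open phred.txt, bases.txt and out.txt instead. The variable names are my own, and the sample lines are the four-character data used in the other answer.

```perl
use strict;
use warnings;

# Stand-in data; replace the scalar references with real file names.
my $phred_text = join( "\n", '"#$%', q{&'()} ) . "\n";
my $bases_text = "ABCD\nEFGH\n";
my $output     = '';

open my $phred_fh, '<', \$phred_text or die $!;
open my $bases_fh, '<', \$bases_text or die $!;
open my $out_fh,   '>', \$output     or die $!;

my %score_for = map { chr( $_ + 33 ) => $_ } 1 .. 8;

my $n = 0;
while ( defined( my $phred = <$phred_fh> ) ) {
    # Stop as soon as either file runs out of lines.
    my $bases = <$bases_fh>;
    last unless defined $bases;
    chomp( $phred, $bases );
    $n++;

    # Translate known symbols to scores; unknown symbols are skipped.
    my @scores =
        map { $score_for{$_} } grep { exists $score_for{$_} } split //, $phred;
    my @letters = split //, $bases;

    # One header, then the scores, then the bases.
    print {$out_fh} "Sequence_$n\n";
    print {$out_fh} join( "\t", @scores ),  "\n";
    print {$out_fh} join( "\t", @letters ), "\n";
}
close $out_fh;

print $output;
```

Only one line from each file is held in memory at a time, so the size of the input does not matter.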
#!/usr/bin/perl
use warnings;
use strict;
open( PHRED, '<', 'phred.txt' ) or die $!;
open( BASES, '<', 'bases.txt' ) or die $!;
open( OUT, '>', 'out.txt' ) or die $!;
my @finalphred;
my @finalbases;
my %hash = (
'"' => "1",
'#' => "2",
'$' => "3",
'%' => "4",
'&' => "5",
q(') => "6",
'(' => "7",
')' => "8"
);
while ( my $fastq = <PHRED> ) {
chomp $fastq;
my @items = split //, $fastq;
my @phreds = map {$hash{$_}} grep {exists $hash{$_}} @items;
push (@finalphred, \@phreds);
}
while ( my $fastq_again = <BASES> ) {
chomp $fastq_again;
my @items = split //, $fastq_again;
push(@finalbases, \@items);
}
for my $i (0 .. $#finalbases) {
# A missing phred line (bases.txt longer than phred.txt) counts as empty
if(@{$finalbases[$i]} && @{$finalphred[$i] // []}) {
print OUT "Sequence_" . ($i + 1),"\n";
printf OUT "%-6s" x scalar @{$finalphred[$i]},@{$finalphred[$i]};
print OUT "\n";
printf OUT "%-6s" x scalar @{$finalbases[$i]},@{$finalbases[$i]};
print OUT "\n";
}
else {
print "The two arrays do not contain an equal number of elements\n";
}
}
Answer 2 (score: 0)
Here is a solution in Perl 6:
#!/usr/bin/env perl6
subset File of Str where *.IO.f;
sub MAIN (File :$phred='phred.txt', File :$bases='bases.txt') {
my $phred-fh = open $phred;
my $bases-fh = open $bases;
my %xlate = map { chr($_ + 33) => $_ }, 1..8;
for 1..* Z $phred-fh.IO.lines Z $bases-fh.IO.lines -> ($i, $score, $seq) {
put join "\n",
"Sequence_$i",
(map { %xlate{$_} }, $score.comb).join("\t"),
$seq.comb.join("\t");
}
}