How do I build a separate output array containing only part of my output?

Time: 2013-02-23 17:40:00

Tags: perl

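I am reading a file into an array and producing output from it with a foreach loop.

Once I have all of the content, how can I extract just one part of it into a new array?

Here is my code: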

foreach (@genetic_codes) {
  chomp;
  my @genetic_codes = split(':', $_);
  if (@genetic_codes != 5) { # skip lines that do not have exactly five fields
    next;
  }
  my $amino_acid = join('","', split(/,/, $genetic_codes[4]));
  print "$genetic_codes[2]=> [$genetic_codes[0],$genetic_codes[1],[$amino_acid]],\n";
}

This is my output, and it is correct:

"M"=> ["Methionine","Met",["ATG"]],
"F"=> ["Phenylalanine","Phe",["TTT"," TTC"]],
"P"=> ["Proline","Pro",["CCT"," CCC"," CCA"," CCG"]],
"S"=> ["Serine","Ser",["TCT"," TCC"," TCA"," TCG"," AGT"," AGC"]],
"T"=> ["Threonine","Thr",["ACT"," ACC"," ACA"," ACG"]],
"W"=> ["Tryptophan","Trp",["TGG"]],

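Now I need to take all of the codons and put them into a single variable, Z, with the duplicates removed.

Do I need to write a separate foreach loop for that?

I am completely lost, please help. My final output needs to look like this: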

"Z"=>["ACT","AGT",---------------SO ON]],

That is, all of the three-letter codons above, collected in one variable.

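1 answer:

Answer 0 (score: 0)

You need to change your code so that the codons for each line of output are stored in a separate array. You can then build up a hash of them line by line; because hash keys are unique, the duplicates disappear automatically.

I have also fixed the handling of the input so that invalid data is detected instead of being silently ignored.

Since you did not provide any sample input data, I have made up something that I think is correct and that produces the output you show in your question.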

use strict;
use warnings;

my %codons;

while (<DATA>) {
  chomp;
  my @genetic_codes = split /:/;
  @genetic_codes == 5 or die "Invalid data found";
  my @amino_acids = $genetic_codes[4] =~ /[ACTG]+/g;
  printf "%s => [%s, %s, [%s]],\n",
      @genetic_codes[2, 0, 1],
      join ', ', map qq{"$_"}, @amino_acids;
  $codons{$_}++ for @amino_acids;
}
printf qq{"%s" => [%s]\n}, 'Z', join ', ', map qq{"$_"}, sort keys %codons;

__DATA__
"Methionine":"Met":"M":"":"ATG"
"Phenylalanine":"Phe":"F":"":"TTT, TTC"
"Proline":"Pro":"P":"":"CCT, CCC, CCA, CCG"
"Serine":"Ser":"S":"":"TCT, TCC, TCA, TCG, AGT, AGC"
"Threonine":"Thr":"T":"":"ACT, ACC, ACA, ACG"
"Tryptophan":"Trp":"W":"":"TGG"

Output

"M" => ["Methionine", "Met", ["ATG"]],
"F" => ["Phenylalanine", "Phe", ["TTT", "TTC"]],
"P" => ["Proline", "Pro", ["CCT", "CCC", "CCA", "CCG"]],
"S" => ["Serine", "Ser", ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]],
"T" => ["Threonine", "Thr", ["ACT", "ACC", "ACA", "ACG"]],
"W" => ["Tryptophan", "Trp", ["TGG"]],
"Z" => ["ACA", "ACC", "ACG", "ACT", "AGC", "AGT", "ATG", "CCA", "CCC", "CCG", "CCT", "TCA", "TCC", "TCG", "TCT", "TGG", "TTC", "TTT"]