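I have a file with two columns. I want to find every unique pattern in column 2, the number of times each pattern occurs, and the corresponding partners from column 1.
Here is a sample of my file; columns 1 and 2 are separated by a tab: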
OG5_126538 01111111111110
OG5_126539 01110111110100
OG5_126552 10000000000000
OG5_126558 11111111111111
OG5_126561 11111010000111
OG5_126566 01111011101001
OG5_126569 11111111111110
OG5_126570 11111111111110
OG5_126572 11111111111110
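The pattern "11111111111110" occurs 3 times in column 2, and its associated partners in column 1 are "OG5_126572, OG5_126570, OG5_126569". I want this information for every unique pattern in column 2.
I wrote the Perl program pasted below, but I keep getting errors. I am new to programming. What is wrong with my program? Thanks in advance for all your help. Perl program: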
#!/usr/local/bin/perl
use strict;
use warnings;
if ( @ARGV < 1 ) {
    print "usage: matrix.pl filename\n";
    die;
}
my $my_file = shift;
my (%matrix_pattern);
open( SOURCE, $my_file );
while (<SOURCE>) {
    chomp;
    my ( $group, $pattern ) = split( "\t", $_ );
    $matrix_pattern{$group} = $pattern;
    $matrix_pattern{$pattern}++;
}
my @unique = values(%matrix_pattern);
my @sorted_unique = sort @unique;
foreach my $unique (@sorted_unique) {
    my $test = $matrix_pattern{$unique};
    print "$unique $test\n";
}
close SOURCE;
Here is the program's output:
01110111110100 1
01111011101001 1
01111111111110 1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
1
10000000000000 1
11111010000111 1
11111111111110 3
11111111111110 3
11111111111110 3
11111111111111 1
Use of uninitialized value $test in concatenation (.) or string at matrix_sample.pl line 27, <SOURCE> line 9.
3
Answer 0 (score: 1)
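You are using the hash's values as keys, and that is where the warnings come from. %matrix_pattern holds two kinds of entries, group => pattern and pattern => count, so values(%matrix_pattern) returns counts such as 1 and 3 alongside the patterns. Those counts are not keys in the hash, so $matrix_pattern{$unique} is undefined for them and the print statement warns.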
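To make the diagnosis concrete, here is a minimal sketch that rebuilds the question's hash from two sample rows and checks which stored values also exist as keys. The helper name classify_values is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuilds the question's hash from [group, pattern] rows
# and reports, for every stored value, whether it also exists as a key.
sub classify_values {
    my @rows = @_;
    my %matrix_pattern;
    for my $row (@rows) {
        my ( $group, $pattern ) = @$row;
        $matrix_pattern{$group} = $pattern;    # key: group,   value: pattern
        $matrix_pattern{$pattern}++;           # key: pattern, value: count
    }

    # values() hands back the stored patterns and the stored counts together.
    my @report;
    for my $value ( sort values %matrix_pattern ) {
        push @report,
            exists $matrix_pattern{$value}
            ? "$value is a key"
            : "$value is NOT a key";
    }
    return @report;
}

print "$_\n" for classify_values(
    [ 'OG5_126569', '11111111111110' ],
    [ 'OG5_126570', '11111111111110' ],
);

# Prints:
# 11111111111110 is a key
# 11111111111110 is a key
# 2 is NOT a key
```

The count 2 is stored as a value but never as a key, which is the same situation that produces the "uninitialized value" warnings in the question's output.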
A simpler way to reach your goal is to use a hash of arrays:
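#!/usr/local/bin/perl
use strict;
use warnings;

# Read the sample rows from the __DATA__ section at the end of this script.
my $fh = \*DATA;

# Maps each pattern to an array of the groups that share it.
my %matrix;
while (<$fh>) {
    chomp;
    # split ' ' splits on any run of whitespace, tabs included.
    my ($group, $pattern) = split ' ';
    push @{$matrix{$pattern}}, $group;
}

# An array in scalar context (string concatenation here) yields its element count.
for my $pattern (sort keys %matrix) {
    print $pattern . ' for ' . @{$matrix{$pattern}} . " times. Values are @{$matrix{$pattern}}\n";
}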
__DATA__
OG5_126538 01111111111110
OG5_126539 01110111110100
OG5_126552 10000000000000
OG5_126558 11111111111111
OG5_126561 11111010000111
OG5_126566 01111011101001
OG5_126569 11111111111110
OG5_126570 11111111111110
OG5_126572 11111111111110
Output:
01110111110100 for 1 times. Values are OG5_126539
01111011101001 for 1 times. Values are OG5_126566
01111111111110 for 1 times. Values are OG5_126538
10000000000000 for 1 times. Values are OG5_126552
11111010000111 for 1 times. Values are OG5_126561
11111111111110 for 3 times. Values are OG5_126569 OG5_126570 OG5_126572
11111111111111 for 1 times. Values are OG5_126558
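The answer reads its rows from __DATA__, while the question takes a file name on the command line and lists the partners separated by commas. A minimal sketch of that variant follows, assuming one tab-separated "group, pattern" pair per line. The helper names group_by_pattern and format_report and the output wording are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reads tab-separated "group<TAB>pattern" lines from a
# filehandle and returns a hash of arrays keyed by pattern.
sub group_by_pattern {
    my ($fh) = @_;
    my %matrix;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $group, $pattern ) = split /\t/, $line;
        next unless defined $pattern;    # skip blank or malformed lines
        push @{ $matrix{$pattern} }, $group;
    }
    return %matrix;
}

# Hypothetical helper: builds one report line per pattern, sorted by pattern,
# with the occurrence count and the comma-joined partners in file order.
sub format_report {
    my (%matrix) = @_;
    my @lines;
    for my $pattern ( sort keys %matrix ) {
        my @groups = @{ $matrix{$pattern} };
        push @lines, sprintf '%s occurs %d time(s): %s',
            $pattern, scalar @groups, join( ',', @groups );
    }
    return @lines;
}

if (@ARGV) {
    my $file = shift @ARGV;
    # Three-argument open with a lexical filehandle, checked for failure.
    open( my $fh, '<', $file ) or die "Cannot open '$file': $!\n";
    print "$_\n" for format_report( group_by_pattern($fh) );
    close $fh;
}
else {
    print "usage: $0 filename\n";
}
```

Checking the result of open matters here: the question's open( SOURCE, $my_file ) fails silently if the file name is wrong, which would leave the loop with nothing to read.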