My input data is shown below. For each unique value in the first column, I want the counts of p1, p2 .. p5.
ID M N
cc1 1 p1
cc1 10 p2
cc1 10 p2
cc2 1 p1
cc2 2 p5
cc3 2 p1
cc3 2 p4
The result I expect is:
ID M p1 p2 p3 p4 p5
cc1 3 1 2 0 0 0
cc3 2 1 0 0 1 0
cc2 2 1 0 0 0 1
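For this I tried a hash of hashes together with a plain hash, and I got the output I expected. But I suspect this can be done with a single hash, because the same data is being stored in two different hashes.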
my (%hash, %hash2);
<$fh>;
while (<$fh>)
{
my($first,$second,$thrid) = split("\t");
$hash{$first}{$thrid}++; #I tried $hash{$first}++{$thrid}++ It throws syntax error
$hash2{$first}++; #it is possible to reduce this hash
}
my @ar = qw(p1 p2 p3 p4 p5);
$, = "\t";
print @ar,"\n";
foreach (keys %hash)
{
print "$_\t$hash2{$_}\t";
foreach my $ary(@ar)
{
if(!$hash{$_}{$ary})
{
print "0\t";
}
else
{
print "$hash{$_}{$ary}\t";
}
}
print "\n";
}
Answer 0 (score: 1)
There is no need for two hashes; the hash of hashes alone is enough. I have only modified your code slightly. See the code below.
use strict;
use warnings;
my %hash;
<DATA>;
while (<DATA>)
{
chomp;
my($first,$second,$thrid) = split("\t");
$hash{$first}{$thrid}++; #I tried $hash{$first}++{$thrid}++ It throws syntax error
}
my @ar = qw(p1 p2 p3 p4 p5);
$, = "\t";
print @ar,"\n";
foreach (keys %hash)
{
# print "$_\t$hash2{$_}\t";
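# Sum the per-N counts for this ID. Dereference explicitly:
# calling values on a hash reference is a syntax error since Perl 5.24.
my $cnt = 0;
$cnt += $_ for values %{ $hash{$_} };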
print "$_\t$cnt\t";
foreach my $ary(@ar)
{
if(!$hash{$_}{$ary})
{
print "0\t";
}
else
{
print "$hash{$_}{$ary}\t";
}
}
print "\n";
}
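You already have a hash of hashes storing the data. The first key is the ID and the second key is N. Summing the values stored under each ID gives the total you want.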
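As a minimal sketch of that summation (the helper name `total_for` and the hard-coded sample hash are made up for illustration, and `sum0` assumes a reasonably recent List::Util), the per-ID total is just the sum of the inner hash's values:

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical sample: counts of each N value per ID, shaped like the
# structure that $hash{$first}{$thrid}++ builds from the question's data.
my %hash = (
    cc1 => { p1 => 1, p2 => 2 },
    cc2 => { p1 => 1, p5 => 1 },
    cc3 => { p1 => 1, p4 => 1 },
);

# total_for is an illustrative helper, not part of the original code.
# It sums the inner counts for one ID; sum0 returns 0 for an empty list,
# and the "// {}" fallback avoids dereferencing undef for an unknown ID.
sub total_for {
    my ($counts, $id) = @_;
    return sum0( values %{ $counts->{$id} // {} } );
}

print "$_\t", total_for(\%hash, $_), "\n" for sort keys %hash;
```

This avoids building an expression string for `eval`, and it keeps the second hash out of the picture entirely.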
Answer 1 (score: 1)
I would probably do it like this:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my %count_of;
#read the header row
chomp( my @header = split ' ', <DATA> );
while (<DATA>) {
my ( $ID, $M, $N ) = split;
$count_of{ $ID }{ $N }++;
}
#print Dumper \%count_of;
#setup the output headers. We could autodetect, but some of these (p3) are entirely empty.
my @p_headers = qw ( p1 p2 p3 p4 p5 );
#if you did want to:
#my @p_headers = sort keys %{{map { $_ => 1 } map { keys %{$count_of{$_}} } keys %count_of }};
#will give p1 p2 p4 p5.
print join( "\t", qw( ID M ), @p_headers ), "\n";
foreach my $ID ( sort keys %count_of ) {
my $total = 0;
$total += $_ for values %{ $count_of{$ID} };
print join( "\t",
$ID,
$total,
map { $count_of{$ID}{$_} // 0 } @p_headers ),
"\n";
}
__DATA__
ID M N
cc1 1 p1
cc1 10 p2
cc1 10 p2
cc2 1 p1
cc2 2 p5
cc3 2 p1
cc3 2 p4
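The header auto-detection mentioned in the comments of the code above can be spelled out more readably. This is only a sketch: `seen_columns` is a hypothetical helper name, and the sample hash is hard-coded to mirror the counts built from the data above.

```perl
use strict;
use warnings;

# Hypothetical sample mirroring the counts built from the data above.
my %count_of = (
    cc1 => { p1 => 1, p2 => 2 },
    cc2 => { p1 => 1, p5 => 1 },
    cc3 => { p1 => 1, p4 => 1 },
);

# seen_columns is an illustrative helper: collect every N value that
# occurs under any ID, de-duplicate via a hash, and return them sorted.
sub seen_columns {
    my ($counts) = @_;
    my %seen;
    for my $inner ( values %$counts ) {
        $seen{$_} = 1 for keys %$inner;
    }
    return sort keys %seen;
}

my @p_headers = seen_columns( \%count_of );
print join( "\t", @p_headers ), "\n";
```

Columns that never occur in the input (p3 here) will not appear, which is why the answer hard-codes the list. The sort is a plain string sort, so a name like p10 would come before p2.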