I have the following dataset:
$name $id $value ##
A abc 2.1
A pqr 5.9
A xyz 5.6
B twg 2.5
B ysc 4.7
C faa 4.7
C bar 2.4
D foo 1.2
D kar 0.3
D tar 3.5
D zyy 0.1
For each $name, I need to extract the $id with the highest $value. I tried something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my $infile;
my %multi_hash;
open ($infile, '<', "test.txt") or die "can't open test.txt: $!\n";
while (my$line=<$infile>) {
my($name,$id,$val)= split(/\t/, $line);
$multi_hash{$name}{$id}=$val;
}
# print Dumper \%multi_hash;
foreach my $name_1(sort keys %multi_hash){
foreach my $id_1 (keys %{$multi_hash{$name_1}}) {
print "$name_1\t$id_1\t$multi_hash{$name_1}{$id_1}";
}
}
I want the output to be:
A pqr 5.9
B ysc 4.7
C faa 4.7
D tar 3.5
All I manage to print is what is already in the input file.
Can anyone help me improve my program? Thanks in advance!
Answer 0 (score: 0)
Using the command line:
perl -lane'
(!defined $_->{m} or $_->{m}<$F[2]) and @$_{"s","m"} = @F[1,2] for $h{$F[0]};
END {
print join" ", $_, @{$h{$_}}{"s","m"} for sort keys %h
}
' file
Output:
A pqr 5.9
B ysc 4.7
C faa 4.7
D tar 3.5
Equivalent script:
local $\ = "\n"; # adds newline to print statements
my %h;
while (<>) {
chomp;
my @F = split ' ', $_; # split columns on white spaces
for my $r ($h{$F[0]}) { # from now on, use $r as reference to $h{$F[0]}
if (!defined $r->{m} or $r->{m} < $F[2]) { # first row for this name, or a new maximum
$r->{s} = $F[1];
$r->{m} = $F[2];
}
}
}
for my $k (sort keys %h) {
my $s = $h{$k}{s};
my $m = $h{$k}{m};
print join " ", $k, $s, $m;
}
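For reuse under use strict and use warnings, the same single-pass idea can be wrapped in a subroutine. This is only a sketch: the name best_per_name and the list-of-lines input are assumptions made for illustration, not part of the answer above. It tests for an existing entry before comparing, so a name whose values are all negative is handled as well, and on a tie the first row seen is kept.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of "name id value" lines and returns
# a hash ref mapping each name to [id, value] for its largest value.
sub best_per_name {
    my @lines = @_;
    my %best;
    for my $line (@lines) {
        my ($name, $id, $value) = split ' ', $line;
        next unless defined $value;    # skip blank or short lines
        # Keep the row if it is the first for this name or a new maximum.
        if (!exists $best{$name} or $value > $best{$name}[1]) {
            $best{$name} = [$id, $value];
        }
    }
    return \%best;
}

my $best = best_per_name("A abc 2.1", "A pqr 5.9", "A xyz 5.6",
                         "B twg 2.5", "B ysc 4.7");
print join(" ", $_, @{ $best->{$_} }), "\n" for sort keys %$best;
```

Only one entry per name is kept in memory, so the cost is a single pass over the input regardless of how many ids each name has.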
Answer 1 (score: 0)
perldoc -q sort
Sort the hash by value.
use warnings;
use strict;
my %multi_hash;
while (<DATA>) {
my ($name,$id,$val) = split;
$multi_hash{$name}{$id} = $val;
}
for my $name_1 (sort keys %multi_hash) {
my %h = %{ $multi_hash{$name_1} };
my $key = (reverse sort { $h{$a} <=> $h{$b} } keys %h)[0];
print "$name_1\t$key\t$multi_hash{$name_1}{$key}\n";
}
__DATA__
A abc 2.1
A pqr 5.9
A xyz 5.6
B twg 2.5
B ysc 4.7
C faa 4.7
C bar 2.4
D foo 1.2
D kar 0.3
D tar 3.5
D zyy 0.1
Or, without the intermediate hash:
for my $name_1 (sort keys %multi_hash) {
my $key = (reverse sort { $multi_hash{$name_1}{$a} <=> $multi_hash{$name_1}{$b} } keys %{ $multi_hash{$name_1} })[0];
print "$name_1\t$key\t$multi_hash{$name_1}{$key}\n";
}
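Sorting every inner hash just to take its first element does more work than needed. As a possible alternative, and not something the answer above proposes, reduce from the core List::Util module can pick the largest entry in one pass. The helper name max_key and the inline sample data below are assumptions for illustration.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Hypothetical helper: given a hash ref of id => value, return the id
# with the largest value (undef for an empty hash).
sub max_key {
    my ($h) = @_;
    return reduce { $h->{$a} >= $h->{$b} ? $a : $b } keys %$h;
}

# Same nested structure as %multi_hash above, filled in by hand here.
my %multi_hash = (
    A => { abc => 2.1, pqr => 5.9, xyz => 5.6 },
    B => { twg => 2.5, ysc => 4.7 },
);

for my $name (sort keys %multi_hash) {
    my $key = max_key($multi_hash{$name});
    print "$name\t$key\t$multi_hash{$name}{$key}\n";
}
```

If two ids share the maximum value, which one is returned depends on the hash's key order, which Perl does not guarantee.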