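What is a better way to get the highest value of each pair from a keyed array in Perl?

Asked: 2018-07-30 15:15:15

Tags: arrays perl

What is a better way to get the highest value from an array of hashes? I want the highest id value for each file in the array (the keys are file and id).

My @array contains these values: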

[
    { file => "messages0.0", id => "1", },
    { file => "messages0.1", id => "2", },
    { file => "messages0.3", id => "3", },
    { file => "messages1.0", id => "1", },
    { file => "messages1.1", id => "2", },
    { file => "messages2.0", id => "1", },
    { file => "messages2.1", id => "1", }
]

If I use

my @new_array = sort { $b->{id} <=> $a->{id} } @array;

the sort function does not work correctly once my values are greater than 10:

messages0.0.log;1
messages1.0.log;1
messages2.0.log;1
messages2.1.log;1
messages1.0.log;10
messages1.0.log;11

Here is the content of my array (fields separated by ; for readability):

messages1.0.log;12
messages1.0.log;11
messages1.0.log;10
messages1.0.log;9
messages0.0.log;8
messages1.0.log;8
messages0.0.log;7
messages1.0.log;7
messages0.0.log;6
messages1.0.log;6
messages0.0.log;5
messages1.0.log;5
messages2.0.log;5
messages2.1.log;5
messages0.0.log;4
messages1.0.log;4
messages2.0.log;4
messages2.1.log;4
messages2.0.log;3
messages2.1.log;3
messages0.0.log;3
messages0.2.log;3
messages0.3.log;3
messages1.0.log;3
messages2.0.log;3
messages2.1.log;3
messages0.3.log;2
messages0.2.log;2
messages0.0.log;2
messages1.0.log;2
messages2.0.log;2
messages2.1.log;2
messages0.0.log;1
messages0.2.log;1
messages0.3.log;1
messages1.0.log;1
messages1.1.log;1
messages2.0.log;1
messages2.1.log;1

The output I want is:

messages1.0.log;12
messages0.0.log;8
messages2.0.log;5
messages2.1.log;5
messages0.2.log;3
messages0.3.log;3
messages1.1.log;1

#!/usr/bin/perl

use strict;
use warnings;

my $STAT = ".logstatistics";

open( STAT, '>', $STAT ) or die $!;

my @new_array = sort { $b->{id} <=> $a->{id} } @array;

# Print Log statistics
foreach my $entry ( @new_array ) {
    print STAT join ';', $entry->{file}, "$entry->{id}\n";
}

close( STAT );

To help with the analysis, I wrote the following code to load the array from the file:

open( STAT, $STAT );

while ( <STAT> ) {
    my @lines = split /\n/;
    my ( $file, $id ) = $lines[0] =~ /\A(.\w.*);(\d.*)/;
    push @array, { file => $file, id => $id, };
}

close( STAT );

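I have solved my problem by adding an if statement where the entries are loaded into @array. If the file name of the previous entry is the same as the current one, the entry is skipped. That way there is only one value per file.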
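The code for that if statement is not shown, so the following is only a minimal sketch of the idea. It assumes the records are already sorted by id in descending order, as in the listing above. It uses a hypothetical %seen hash instead of comparing only with the previous file name, because records for the same file are not adjacent in that listing. The sample data and the name @highest are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample, already sorted by id (descending).
my @array = (
    { file => 'messages1.0.log', id => 12 },
    { file => 'messages1.0.log', id => 11 },
    { file => 'messages0.0.log', id => 8 },
    { file => 'messages1.0.log', id => 8 },
    { file => 'messages2.0.log', id => 5 },
    { file => 'messages0.0.log', id => 1 },
);

# Keep only the first record seen for each file. Because the input is
# sorted by id in descending order, that record carries the highest id.
# %seen is a hypothetical helper, not part of the original script.
my %seen;
my @highest = grep { !$seen{ $_->{file} }++ } @array;

print join( ';', $_->{file}, $_->{id} ), "\n" for @highest;
```

If the input is not sorted, this shortcut no longer holds and the highest id has to be tracked per file instead.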

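2 Answers:

Answer 0: (score: 1)

Instead of

my @new_array = sort { $a->{id} cmp $b->{id} } @array;

try

my @new_array = sort { $a->{id} <=> $b->{id} } @array;

The <=> operator treats the fields being compared as numbers rather than strings. It considers 10 greater than 3, and likewise 10 greater than 03.

The cmp operator treats your values as strings, so it sorts "21" before "3", comparing character by character as it would for a string such as "BA".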
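The difference is easy to see on a small list. This is a minimal sketch; the values in @ids are made up for illustration and do not come from the question's data.

```perl
use strict;
use warnings;

# Illustrative ids, not taken from the question's data.
my @ids = ( 3, 21, 10, 1 );

my @as_strings = sort { $a cmp $b } @ids;    # string comparison
my @as_numbers = sort { $a <=> $b } @ids;    # numeric comparison

print "cmp: @as_strings\n";    # cmp: 1 10 21 3
print "<=>: @as_numbers\n";    # <=>: 1 3 10 21
```

Swapping $a and $b in the comparison block reverses the order, which is how the descending sort in the question is written.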

Answer 1: (score: 1)

This seems to do what you want.

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

# This seems to be the data structure that you are working with
my @data = ( {
  file => 'messages1.0.log', id => 12,
}, {
  file => 'messages1.0.log', id => 11,
}, {
  file => 'messages1.0.log', id => 10,
}, {
  file => 'messages1.0.log', id => 9,
}, {
  file => 'messages0.0.log', id => 8,
}, {
  file => 'messages1.0.log', id => 8,
}, {
  file => 'messages0.0.log', id => 7,
}, {
  file => 'messages1.0.log', id => 7,
}, {
  file => 'messages0.0.log', id => 6,
}, {
  file => 'messages1.0.log', id => 6,
}, {
  file => 'messages0.0.log', id => 5,
}, {
  file => 'messages1.0.log', id => 5,
}, {
  file => 'messages2.0.log', id => 5,
}, {
  file => 'messages2.1.log', id => 5,
}, {
  file => 'messages0.0.log', id => 4,
}, {
  file => 'messages1.0.log', id => 4,
}, {
  file => 'messages2.0.log', id => 4,
}, {
  file => 'messages2.1.log', id => 4,
}, {
  file => 'messages2.0.log', id => 3,
}, {
  file => 'messages2.1.log', id => 3,
}, {
  file => 'messages0.0.log', id => 3,
}, {
  file => 'messages0.2.log', id => 3,
}, {
  file => 'messages0.3.log', id => 3,
}, {
  file => 'messages1.0.log', id => 3,
}, {
  file => 'messages2.0.log', id => 3,
}, {
  file => 'messages2.1.log', id => 3,
}, {
  file => 'messages0.3.log', id => 2,
}, {
  file => 'messages0.2.log', id => 2,
}, {
  file => 'messages0.0.log', id => 2,
}, {
  file => 'messages1.0.log', id => 2,
}, {
  file => 'messages2.0.log', id => 2,
}, {
  file => 'messages2.1.log', id => 2,
}, {
  file => 'messages0.0.log', id => 1,
}, {
  file => 'messages0.2.log', id => 1,
}, {
  file => 'messages0.3.log', id => 1,
}, {
  file => 'messages1.0.log', id => 1,
}, {
  file => 'messages1.1.log', id => 1,
}, {
  file => 'messages2.0.log', id => 1,
}, {
  file => 'messages2.1.log', id => 1,
});

my %stats;

# Walk your input data, making a note of the highest
# id associated with every file.
for (@data) {
  if (($stats{$_->{file}} // 0) < $_->{id}) {
    $stats{$_->{file}} = $_->{id};
  }
}

# Walk the %stats hash in sorted order, printing
# the file and the maximum associated id.
for ( sort my_clever_sort keys %stats) {
  say join ';', $_, $stats{$_};
}

# (Slightly) clever sorting algorithm
sub my_clever_sort {
  # Extract the floating point numbers from the filenames
  my ($str_num_a) = $a =~ /(\d+\.\d+)/;
  my ($str_num_b) = $b =~ /(\d+\.\d+)/;

  # Sort by id (descending) and then filename (ascending)
  return ($stats{$b} <=> $stats{$a}) || ($str_num_a <=> $str_num_b);
}
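The same per-file maximum can also be built straight from lines in the file;id format that the question writes to its statistics file. The following is a hedged sketch rather than part of the answer above: the lines in @lines and the name %max_id are illustrative, and ties are broken by plain string comparison of the file names instead of the numeric extraction used in my_clever_sort.

```perl
use strict;
use warnings;

# Illustrative lines in the "file;id" format of the statistics file.
my @lines = (
    "messages1.0.log;12\n",
    "messages0.0.log;8\n",
    "messages1.0.log;9\n",
    "messages0.0.log;3\n",
    "messages2.1.log;5\n",
);

# %max_id is a hypothetical name: file name => highest id seen so far.
my %max_id;
for my $line (@lines) {
    chomp $line;
    my ( $file, $id ) = split /;/, $line, 2;
    next unless defined $id && $id =~ /\A\d+\z/;    # skip malformed lines
    $max_id{$file} = $id
        if !exists $max_id{$file} || $id > $max_id{$file};
}

# Highest id first; ties broken by file name (string comparison).
for my $file ( sort { $max_id{$b} <=> $max_id{$a} || $a cmp $b } keys %max_id ) {
    print "$file;$max_id{$file}\n";
}
```

To read from the real file, the loop over @lines would be replaced by a while loop over an opened filehandle; the rest stays the same.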