散列列表的参数化排序

时间:2018-01-03 15:42:15

标签: arrays perl sorting hash

我的目标是编写一个接受

的子程序
  1. 一系列哈希
  2. 包含排序顺序的列表
  3. 为了清楚 - 键可能是任何东西。我的例子仅供参考。

    给定一个包含所需排序顺序的键列表的数组

    $a->{DELTA1_2} <=> $b->{DELTA1_2} or $a->{SET1} <=> $b->{SET1} or $a->{SET2} <=> $b->{SET2}
    

    我的想法是形成一个字符串

    eval

    然后使用my $paRecords = [ { 'SET1' => 48265, 'DELTA1_2' => -1, 'SET2' => 48264 }, { 'SET1' => 8328, 'DELTA1_2' => -29, 'SET2' => 8299 }, { 'SET1' => 20, 'DELTA1_2' => 0, 'SET2' => 0 }, { 'SET1' => 10, 'DELTA1_2' => 0, 'SET2' => 0 } ]; my @aSortOrder = ( 'DELTA1_2', 'SET1', 'SET2' ); my $pStr = ''; foreach ( @aSortOrder ) { $pStr = $pStr . ' or $a->{' . $_ . '} <=> $b->{' . $_ . '}'; } $pStr =~ s/^\s*or\s*//; my @aSorted = sort { eval "$pStr"; } @$paRecords; print Dumper \@aSorted; 执行它。

    这是我的代码

    $VAR1 = [
              {
                'SET1' => 8328,
                'SET2' => 8299,
                'DELTA1_2' => -29
              },
              {
                'SET1' => 48265,
                'SET2' => 48264,
                'DELTA1_2' => -1
              },
              {
                'SET2' => 0,
                'DELTA1_2' => 0,
                'SET1' => 10
              },
              {
                'SET2' => 0,
                'DELTA1_2' => 0,
                'SET1' => 20
              }
            ];
    

    输出

    {{1}}

    我想这远不是解决问题的理想方法,因此任何关于如何更好地解决这个问题的指针都将是一个很大的帮助。

3 个答案:

答案 0 :(得分:3)

只需创建一个进行比较的子。

sub custom_cmp {
   my $keys = shift;
   for my $key (@$keys) {
      my $cmp = $_[0]{$key} <=> $_[1]{$key};
      return $cmp if $cmp;
   }

   return 0;
}

my @aSorted = sort { custom_cmp(\@aSortOrder, $a, $b) } @$paRecords;

上面为每次比较做了两个子调用。如果我们生成比较函数,我们可以将其减少为一个。

sub make_custom_cmp {
   my @keys = @_;
   return sub($$) {
      for my $key (@keys) {
         my $cmp = $_[0]{$key} <=> $_[1]{$key};
         return $cmp if $cmp;
      }

      return 0;
   };
}

my $cmp = make_custom_cmp(@aSortOrder);

my @aSorted = sort $cmp @$paRecords;

我们可以更进一步,通过代码生成来平整循环。这是一个适当的&#34;基于eval的解决方案看起来像。但是,几乎不需要这种级别的优化。

sub make_custom_cmp {
   my @keys = @_;
   my @cmps;
   for $i (0..$#keys) {
      push @cmps, "\$_[0]{\$keys[$i]} <=> \$_[1]{\$keys[$i]}"
   }

   return eval("sub($$) { ".( join(" || ", @cmps) )."}");
}

my $cmp = make_custom_cmp(@aSortOrder);

my @aSorted = sort $cmp @$paRecords;

事实上,以下可能是性能最佳的解决方案:

my @aSorted =
   map $paRecords->[ unpack('N', substr($_, -4))-0x7FFFFFFF ],
      sort
         map pack('N*', map $_+0x7FFFFFFF, @{ $paRecords->[$_] }{@aSortOrder}, $_),
            0..$#$paRecords;

答案 1 :(得分:1)

传递给sort的块可能包含任意数量的代码。根据{{​​1}}是否应被视为小于,等于或大于$a

,仅需要评估为负数,零或正数。

我同意你决定将它捆绑到一个子程序中,所以我写了$b,它要求对一个哈希数组的引用进行排序,并引用一个关键字符串数组。它返回根据键列表排序的哈希列表

sort_hashes_by_keys

输出

use strict;
use warnings 'all';

use Data::Dump 'dd';

my $records =  [
    { SET1 => 48265, DELTA1_2 => -1,  SET2 => 48264 },
    { SET1 => 8328,  DELTA1_2 => -29, SET2 => 8299  },
    { SET1 => 20,    DELTA1_2 => 0,   SET2 => 0     },
    { SET1 => 10,    DELTA1_2 => 0,   SET2 => 0     }
];

my @sort_order = qw/ DELTA1_2 SET1 SET2 /;

my @sorted = sort_hashes_by_keys( $records, \@sort_order );

dd \@sorted;



sub sort_hashes_by_keys {
    my ( $hashes, $order ) = @_;

    sort {

        my $cmp = 0;

        for my $key ( @$order ) {
            last if $cmp = $a->{$key} <=> $b->{$key};
        }

        $cmp;

    } @$hashes;
}


请注意,在命名变量时,我强烈反对匈牙利表示法驼峰案例。 Perl不是严格键入的,它有 sigils ,如[ { DELTA1_2 => -29, SET1 => 8328, SET2 => 8299 }, { DELTA1_2 => -1, SET1 => 48265, SET2 => 48264 }, { DELTA1_2 => 0, SET1 => 10, SET2 => 0 }, { DELTA1_2 => 0, SET1 => 20, SET2 => 0 }, ] $@,表示每个变量的类型,所以匈牙利语符号充其量是多余的,还会增加分散注意力和无关紧要的噪音。此外,按照惯例,大写字母是为模块名称和全局变量保留的,因此本地标识符应该在&#34; snake case&#34;中,即小写字母和下划线。许多非英语使用者也发现骆驼案难以解析

答案 2 :(得分:0)

嗯,你说得对 - 使用eval这样就是通往未来痛苦的道路。

'sort'的乐趣在于你可以定义一个排序子程序,它隐含地定义$a$b,你可以使用你想要的任何逻辑来判断它是正面的,负面的还是'零'比较(相等)。 (例如,如<=>cmp)。

这里的诀窍是 - 'true'是非零的,所以<=>你可以测试'true'以查看是否进行了比较(4 <=> 4是'false')< / p>

因此,如果您只是以数字方式工作(您需要测试'字母数字'并在某些情况下使用cmp,但似乎不适用于您数据):

#!/usr/bin/env perl
use strict;
use warnings;

my $paRecords = [
   { 'SET1' => 48265, 'DELTA1_2' => -1,  'SET2' => 48264 },
   { 'SET1' => 8328,  'DELTA1_2' => -29, 'SET2' => 8299 },
   { 'SET1' => 20,    'DELTA1_2' => 0,   'SET2' => 0 },
   { 'SET1' => 10,    'DELTA1_2' => 0,   'SET2' => 0 }
];

#qw is 'quote-words' and just lets you space delimit terms. 
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );

#note - needs to come after definition of `@order` but it can be re-written later as long as it's in scope. 
#you can pass an order explicitly into the subroutine if you want though. 
sub order_by {
   for my $key (@order) {
      #compare key
      my $result = $a->{$key} <=> $b->{$key};
      #return it and exit the loop if they aren't equal, otherwise 
      #continue iterating sort terms. 
      return $result if $result;
   }
   return 0; #all keys were similar, therefore return zero.
}

print join (",", @order), "\n";
foreach my $record ( sort {order_by} @$paRecords ) {
   #use hash slice to order output in 'sort order'. 
   #optional, but hopefully clarifies what's going on. 
   print join (",", @{$record}{@order}), "\n";
}

这给出了您的数据输出:

DELTA1_2,SET1,SET2
-29,8328,8299
-1,48265,48264
0,10,0
0,20,0

注意,我选择使用hash slice作为输出,因为否则哈希是无序的,因此您的Dumper输出将不一致(随机排序的字段)。

如果您需要对订购更加动态,可以将其传递给sort-sub:

#!/usr/bin/env perl
use strict;
use warnings;

sub order_by {
   for my $key (@_) {
      #compare key
      my $result = $a->{$key} <=> $b->{$key};

      #return it and exit the loop if they aren't equal, otherwise
      #continue iterating sort terms.
      return $result if $result;
   }
   return 0;    #all keys were similar, therefore return zero.
}

my $paRecords = [
   { 'SET1' => 48265, 'DELTA1_2' => -1,  'SET2' => 48264 },
   { 'SET1' => 8328,  'DELTA1_2' => -29, 'SET2' => 8299 },
   { 'SET1' => 20,    'DELTA1_2' => 0,   'SET2' => 0 },
   { 'SET1' => 10,    'DELTA1_2' => 0,   'SET2' => 0 }
];

#qw is 'quote-words' and just lets you space delimit terms.
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );

print join( ",", @order ), "\n";
foreach my $record ( sort {order_by ( @order ) } @$paRecords ) {

   #use hash slice to order output in 'sort order'.
   #optional, but hopefully clarifies what's going on.
   print join( ",", @{$record}{@order} ), "\n";
}