How can I extract blocks from this configuration file using Perl?

Asked: 2011-11-03 15:18:20

Tags: regex arrays perl search

I am trying to search a load balancer configuration and extract some data. The configuration file looks like this:

pool {
name           "POOL_name1"
ttl            30
monitor all "tcp"
preferred      rr
partition "Common"

member         12.24.5.100:80
}

pool {
name           "Pool-name2"
ttl            30
monitor all "https_ignore_dwn"
preferred      rr
fallback       rr
partition "Common"

member         69.241.25.121:8443
member         69.241.25.122:8443
}   

I am trying to assign each pool configuration to its own array, so I can loop through the array looking for specific IP addresses and pool names. I tried the following regular expressions, but they don't work.

my @POOLDATA = <FILE>;
close FILE;
foreach (@POOLDATA) {
  if (/^pool\s\{\s/ .. /^\}\s/) { 
push (@POOLCONFIG, "$_");
}
}

Does anyone have a suggestion for splitting each pool configuration into its own array? (Or a better approach?) Thanks in advance for your help.

3 Answers:

Answer 0 (score: 2)

#!/usr/bin/env perl

use warnings; use strict;

my @pools;

my $keys = join('|', sort
    'name',
    'ttl',
    'monitor all',
    'preferred',
    'partition',
    'member'
);

my $pat = qr/^($keys)\s+([^\n]+)\n\z/;

while ( my $line = <DATA> ) {
    if ($line =~ /^pool\s+{/ ) {
        push @pools, {};
    }
    elsif (my ($key, $value) = ($line =~ $pat)) {
        $value =~ s/^"([^"]+)"\z/$1/;
        push @{ $pools[-1]->{$key} }, $value;
    }
}

use Data::Dumper;
print Dumper \@pools;


__DATA__
pool {
name           "POOL_name1"
ttl            30
monitor all "tcp"
preferred      rr
partition "Common"

member         12.24.5.100:80
}

pool {
name           "Pool-name2"
ttl            30
monitor all "https_ignore_dwn"
preferred      rr
fallback       rr
partition "Common"

member         69.241.25.121:8443
member         69.241.25.122:8443
}

Output:

$VAR1 = [
          {
            'monitor all' => [
                               'tcp'
                             ],
            'member' => [
                          '12.24.5.100:80'
                        ],
            'ttl' => [
                       '30'
                     ],
            'name' => [
                        'POOL_name1'
                      ],
            'preferred' => [
                             'rr'
                           ],
            'partition' => [
                             'Common'
                           ]
          },
          {
            'monitor all' => [
                               'https_ignore_dwn'
                             ],
            'member' => [
                          '69.241.25.121:8443',
                          '69.241.25.122:8443'
                        ],
            'ttl' => [
                       '30'
                     ],
            'name' => [
                        'Pool-name2'
                      ],
            'preferred' => [
                             'rr'
                           ],
            'partition' => [
                             'Common'
                           ]
          }
        ];

Edit:

Of course, you can check for the member element and, if none is found, fill in a default. In fact, with the basic structure in place, you should be able to do that yourself.

One way is to check for the end of each pool record:

while ( my $line = <DATA> ) {
    if ($line =~ /^pool\s+{/ ) {
        push @pools, {};
    }
    elsif (my ($key, $value) = ($line =~ $pat)) {
        $value =~ s/^"([^"]+)"\z/$1/;
        push @{ $pools[-1]->{$key} }, $value;
    }
    elsif ($line =~ /^\s*}/) {
        my $last = $pools[-1];

        if ($last and not $last->{member}) {
            $last->{member} = [ qw(0.0.0.0) ];
        }
    }

}

Answer 1 (score: 1)

As Sinan Unur suggests, you can store references to hashes in an array. That way, each element of the array is a hash.

Incidentally, Sinan's data structure is a bit more involved: you have an array of pools. Each pool is a hash whose keys are the pool element names and whose values are references to arrays. That way, each element in a pool can have multiple values (as your IP addresses do).

My only comment is that I would probably use a hash to store the pools, keyed by IP address. That is, assuming IP addresses are unique to a particular pool. That way, you can easily pull up a pool by IP address without searching. For the same reason, I would also keep a parallel structure keyed by pool name. (And, since each pool is a reference, storing the pools by both IP address and name takes no extra memory. Moreover, updating a pool through one index automatically updates it in the other.)
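A minimal sketch of the dual-index idea above, assuming pool records shaped like the parsed structure from Answer 0 (the sample names and addresses are simply taken from the question):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical pool records, shaped like Answer 0's parsed output
my @pools = (
    { name => ['POOL_name1'], member => ['12.24.5.100:80'] },
    { name => ['Pool-name2'],
      member => [ '69.241.25.121:8443', '69.241.25.122:8443' ] },
);

my ( %by_ip, %by_name );
for my $pool (@pools) {
    $by_name{ $pool->{name}[0] } = $pool;          # index by pool name
    $by_ip{$_} = $pool for @{ $pool->{member} };   # index by member IP:port
}

# Both indexes hold the same references, so an update through one
# is visible through the other:
$by_ip{'12.24.5.100:80'}{ttl} = [60];
print $by_name{'POOL_name1'}{ttl}[0], "\n";    # prints 60
```

Because `%by_ip` and `%by_name` store references rather than copies, the two indexes cost almost nothing beyond the original array of pools.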

If you are not familiar with Perl references, or with how to create an array of hashes or a hash of arrays, you may want to look at a Perl references tutorial.

Once you are comfortable with multi-layered Perl structures, you will quickly see how to use object-oriented design in your Perl scripts, which makes maintaining these structures much easier.

Answer 2 (score: 1)

Another way of looking at it. This one handles multiple member fields in particular.

use strict;
use warnings;
use Data::Dumper;
use English         qw<$RS>;
use List::MoreUtils qw<natatime>;
use Params::Util    qw<_ARRAY _CODE>;

# Here, we rig the record separator to break on \n}\n
local $RS = "\n}\n";

# Here, we standardize a behavior with hash duplicate keys
my $TURN_DUPS_INTO_ARRAYS = sub { 
    my ( $hr, $k, $ov, $nv ) = @_;
    if ( _ARRAY( $ov )) { 
      push @{ $ov }, $nv;
    }
    else { 
      $hr->{ $k } = [ $ov, $nv ];
    }
};

# Here is a generic hashing routine
# Most of the work is figuring out how the user wants to store values
# and deal with duplicates
sub hash { 
    my ( $code, $param_name, $store_op, $on_duplicate );
    while ( my ( $peek ) = @_ ) {
        if ( $code = _CODE( $peek )) {
            last unless $param_name;

            if ( $param_name eq 'on_dup' ) { 
                $on_duplicate = shift;
            }
            elsif ( $param_name eq 'store' ) { 
                $store_op = shift;
            }
            else { 
                last;
            }
            undef $code;
        }
        else {
            my @c = $peek =~ /^-?(on_dup|store$)/;
            last unless $param_name = $c[0];
            shift;
        }
    }

    $store_op     ||= sub { $_[0]->{ $_[1] } = $_[3]; };
    $on_duplicate ||= $code || $store_op;

    my %h;
    while ( @_ ) { 
        my $k = shift;
        next unless defined( my $v = shift );
        (( exists $h{ $k } and $on_duplicate ) ? $on_duplicate 
        : $store_op 
        )->( \%h, $k, $h{ $k }, $v )
        ;
    }
    return wantarray ? %h : \%h;
}


my %pools;
# So the loop is rather small
while ( <DATA> ) { 
    # remove pool { ... } brackets
    s/\A\s*pool\s+\{\s*\n//smx;
    s/\n\s*\}\n*//smx;
    my $h 
        = hash( -on_duplicate => $TURN_DUPS_INTO_ARRAYS
       ,  map {  s/"$//; s/\s+$//; $_ } 
          map   { split /\s+"|\s{2,}/msx, $_, 2 } 
          split /\n/m
        );
    $pools{ $h->{name} } = $h;
}
print Dumper( \%pools );
### %pools

__DATA__
pool {
name           "POOL_name1"
ttl            30
monitor all "tcp"
preferred      rr
partition "Common"

member         12.24.5.100:80
}

pool {
name           "Pool-name2"
ttl            30
monitor all "https_ignore_dwn"
preferred      rr
fallback       rr
partition "Common"

member         69.241.25.121:8443
member         69.241.25.122:8443
}   

A note about the hash function: I have noticed quite a few posts recently about how hashes deal with duplicate keys. This is a general solution.
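The duplicate-key idea can also be sketched without the generic helper. This stand-alone fragment (the sample keys and values are made up from the question's data) promotes a scalar to an array reference the first time a key repeats:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %h;
# Feed in key/value pairs, where 'member' repeats
for my $pair ( [ name   => 'POOL_name1' ],
               [ member => '69.241.25.121:8443' ],
               [ member => '69.241.25.122:8443' ] ) {
    my ( $k, $v ) = @$pair;
    if ( exists $h{$k} ) {
        # On a duplicate, promote the existing scalar to an array
        # reference, then append the new value
        $h{$k} = [ $h{$k} ] unless ref $h{$k} eq 'ARRAY';
        push @{ $h{$k} }, $v;
    }
    else {
        $h{$k} = $v;
    }
}
# $h{name} stays a plain scalar; $h{member} becomes
# [ '69.241.25.121:8443', '69.241.25.122:8443' ]
```

The trade-off versus Answer 0's approach (which always stores array references) is that callers must check `ref` before dereferencing a value.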