Perl: nested array as a hash value

Time: 2018-07-24 18:43:56

Tags: perl

I'm fairly new to Perl, and I want to create a hash whose values are arrays in which one element is itself another array. It should look like this:

my_hash{key} = [ele1, ele2, [arr_ele1, arr_ele2]];

This is what I'm doing:

use Data::Dumper;

my @fields;
my @child_ids;
my $children;
my %spr_hash;

# skipping header
my $header = <$data>;

while(my $line = <$data>) {
    chomp $line;

    # my file is ; delimited
    @fields = split ";" , $line;
    @child_ids = ();
    # field 6 is a list of 0+ numbers separated by either space or ,
    $children = $fields[6];

    # if children field is not empty
    if ($children) {
        # remove any text
        $children =~ s/[a-zA-Z]//g;

        # if commas are in the field, split on comma
        # if no comma and no space, assume only 1 entry
        # else split on whitespace

        if (index($children, ",") != -1) {
            @child_ids = split "," , $children;
        } elsif(index($children, " ") != -1) {
            push @child_ids, $children;
        } else {
            @child_ids = split ' ' , $children;
        }
        print @child_ids;
        print "\n";
    }

# ASSIGN
    $spr_hash{$fields[0]} = [$fields[1], $fields[2], $fields[3], $fields[4], $fields[5], @child_ids, $fields[7], $fields[8]];

 }

My problem is that when I feed it the following input:

id;date;p1;owner;description;status;1, 2;sVal;xVal

I get this:

print Dumper($spr_hash{"id"})

$VAR1 = [
      'date',
      'p1',
      'owner',
      'description',
      'status',
      '1',
      ' 2',
     'sVal',
      'xVal'
    ];

c1 and c2 (the '1' and ' 2' above) end up as two separate entries instead of a single array entry.

How can I produce this output instead:

$VAR1 = [
      'date',
      'p1',
      'owner',
      'description',
      'status',
      [1, 2],
     'sVal',
      'xVal'
    ];

3 answers:

Answer 0: (score: 1)

Here is what you should do:

use Data::Dumper;

my @fields;
my $children;
my %spr_hash;

# skipping header
my $header = <$data>;

while(my $line = <$data>) {
    chomp $line;

    # my file is ; delimited
    @fields = split ";" , $line;

    # Create a new version of @child_ids for each iteration of the loop
    my @child_ids = ();
    # field 6 is a list of 0+ numbers separated by either space or ,
    $children = $fields[6];

    # if children field is not empty, but can't use if ($children) as this will 
    # not allow a single 0 to be a valid input
    if (length $children) {
        # Only want digits and delimiters so strip out everything else. 
        # Always work out what you want to keep as the set of stuff you
        # want to remove is usually wrong. The original version would have 
        # kept in the string things like $, ! or é
        $children =~ s/[^ ,0-9]//g;

        # Split on runs of commas and/or spaces
        # Split takes a regex so you can do the split in one go
        # The + matters: with plain /[ ,]/ a field like '1, 2' or '98 ,32, 33'
        # would yield empty elements between adjacent delimiters
        @child_ids = split /[ ,]+/, $children;
        print @child_ids;
        print "\n";
    }

    # ASSIGN
    # Note the \@child_ids: this puts a reference to @child_ids in the data
    # so that @child_ids isn't flattened, which was what was causing your
    # original bug. Also note that this only works because you are creating a
    # new @child_ids on each iteration of the loop. If you moved the
    # my @child_ids outside the loop, the assignment would store a reference
    # to the same variable every time, and each record would end up with the
    # last entry of field 6 from the file

    $spr_hash{$fields[0]} = [$fields[1], $fields[2], $fields[3], $fields[4],
        $fields[5], \@child_ids, $fields[7], $fields[8]];
}
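
The point in the ASSIGN comment about where @child_ids is declared can be seen in isolation. The following is only a minimal sketch with made-up sample lines and hypothetical variable names (%shared, %separate, @ids_outside, @ids_inside); it is not part of the original script:

```perl
use strict;
use warnings;

my @lines = ('a;1,2', 'b;3,4');    # made-up sample records

# Variant 1: the array is declared once, outside the loop
my (%shared, @ids_outside);
for my $line (@lines) {
    my ($key, $list) = split /;/, $line;
    @ids_outside = split /,/, $list;
    $shared{$key} = \@ids_outside;     # every key points at the same array
}

# Variant 2: a fresh array is declared on each iteration
my %separate;
for my $line (@lines) {
    my ($key, $list) = split /;/, $line;
    my @ids_inside = split /,/, $list;
    $separate{$key} = \@ids_inside;    # each key gets its own array
}

print "shared:   a=@{ $shared{a} } b=@{ $shared{b} }\n";      # a=3 4 b=3 4
print "separate: a=@{ $separate{a} } b=@{ $separate{b} }\n";  # a=1 2 b=3 4
```

In the first variant both keys hold a reference to one and the same array, so both show the values from the last line read.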

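Answer 1: (score: 0)

It isn't clear whether the behaviour you have written in Perl is what you actually want. It removes the letters from $children and then tries to split it, but your desired output shows ['c1', 'c2'], which cannot be achieved that way

If all you want is the sequences of digits from the seventh field, then it is easier to use a global regex match. I would do it like this; it simply replaces that field in place with an array reference. If you want something else then please say so

Note that $data is a poor name for a file handle, and the _hash in %spr_hash is unnecessary because the % already says that it is a hash

Also, don't declare all of your variables at the top of the file. That effectively makes them global, and global variables are a Bad Thing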

use strict;
use warnings 'all';

use Data::Dumper;

# skipping header
<$fh>;

my %spr;

while ( <$fh> ) {
    chomp;
    my ($key, @fields) = split /;/;
    $fields[5] = [ $fields[5] =~ /\d+/g ];
    $spr{$key} = \@fields;
}

print Dumper \%spr;
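
The key step above is the global match in list context. As a small illustration, assuming a made-up field value and hypothetical variable names, /\d+/g in list context returns every run of digits, and wrapping it in [ ] yields an array reference:

```perl
use strict;
use warnings;

my $field = 'c1, c2 3';             # made-up sample value for the seventh column
my @ids   = $field =~ /\d+/g;       # list context: all matches, in order
my $ref   = [ $field =~ /\d+/g ];   # the same list wrapped in an array reference

print "ids: @ids\n";                      # ids: 1 2 3
print "count: ", scalar(@$ref), "\n";     # count: 3

my $blank = '';
my @none  = $blank =~ /\d+/g;       # no digits at all: empty list
print "empty: ", scalar(@none), "\n";     # empty: 0
```

A field with no digits therefore becomes a reference to an empty array rather than undef.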

Answer 2: (score: -1)

Replace

$spr_hash{$fields[0]} = [$fields[1], $fields[2], $fields[3], $fields[4], $fields[5], @child_ids, $fields[7], $fields[8]];

with

$spr_hash{$fields[0]} = [$fields[1], $fields[2], $fields[3], $fields[4], $fields[5], [ @child_ids ], $fields[7], $fields[8]];

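In Perl, lists are flattened, so when you use @child_ids directly inside the [ ] of a new array reference, the fact that it was an array is lost: its contents are simply copied into the new, larger array, and nothing more.

If you wrap it in [ ] instead, you create a new array reference, and that reference (rather than the array's contents) is stored in the outer array. That should produce your second example, barring other logic errors.

Note that when @child_ids is empty, this adds [] (a reference to an empty array) to $spr_hash{$fields[0]}. That may or may not be what you want.

Also, unrelated to your question: I would suggest not using an array here at all. Each item really is a separate element with its own meaning, so something like this may make more sense: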
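
To make the difference concrete, here is a small sketch (hypothetical variable names, sample values borrowed from the question) comparing the flattened form, the [ @child_ids ] copy, and a \@child_ids reference:

```perl
use strict;
use warnings;

my @child_ids = (1, 2);

my $flat   = [ 'status', @child_ids, 'sVal' ];      # flattened: 4 elements
my $copied = [ 'status', [ @child_ids ], 'sVal' ];  # nested copy: 3 elements
my $shared = [ 'status', \@child_ids, 'sVal' ];     # nested, same array: 3 elements

push @child_ids, 3;    # a later change to the original array

print scalar(@$flat), "\n";             # 4
print scalar(@{ $copied->[1] }), "\n";  # 2 (the copy is unaffected)
print scalar(@{ $shared->[1] }), "\n";  # 3 (the reference sees the change)
```

[ @child_ids ] makes an independent copy, so it stays correct even if @child_ids is reused later; \@child_ids is only safe when the array is not modified afterwards.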

$spr_hash{$fields[0]} = {
    date => $fields[1],
    whatever_is_p1 => $fields[2],
    owner => $fields[3],
    # etc...
};

Take a look at these two tutorials for further reference:

See also array slices, which make your construct shorter to write.
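
As a rough sketch of what slices could look like here, using the sample row from the question: the key names in %record and the children key are guesses for illustration only, not something the question defines.

```perl
use strict;
use warnings;

my $line      = 'id;date;p1;owner;description;status;1, 2;sVal;xVal';
my @fields    = split /;/, $line;
my @child_ids = $fields[6] =~ /\d+/g;

my %spr_hash;
# @fields[1 .. 5] and @fields[7, 8] are array slices; they flatten into the list
$spr_hash{ $fields[0] } = [ @fields[1 .. 5], \@child_ids, @fields[7, 8] ];

# Hash slice: fill several named keys in one assignment
# (key names are assumptions based on the sample row)
my %record;
@record{qw(date p1 owner description status)} = @fields[1 .. 5];
$record{children} = \@child_ids;

print scalar(@{ $spr_hash{id} }), "\n";   # 8
print "$record{owner}\n";                 # owner
print "@{ $record{children} }\n";         # 1 2
```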