Pruning empty and singleton arrayrefs from a complex Perl data structure

Time: 2012-06-26 14:35:54

Tags: perl data-structures perl-data-structures traversal

I'm trying to prune a large data structure in Perl that was read in from JSON. Two typical elements look like this (in JSON):

[
    [ [ {"payload":"test"} ], [ [ {"payload":"reply"} ], [] ] ],
    [ [ {"payload":"another thread"} ] ]
]

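I want to remove the empty arrayref at the bottom of that element entirely, and replace every arrayref that contains nothing but a single hashref with the hashref it contains. In other words, the result should look like this: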

[
    [ {"payload":"test"}, [ {"payload":"reply"} ] ],
    [ {"payload":"another thread"} ]
]

My current code is as follows:

use v5.12;
use strict;
use warnings;
use JSON::XS;
use Data::Walk;

sub eliminate_singleton {
    if (ref $_ eq 'ARRAY') {
        if (scalar(@{$_}) == 0) {
            die 'mysteriously I never reach this branch!';
            while (my ($key,$value) = each @{$Data::Walk::container}) {
                if ($value == $_) {
                    delete ${$Data::Walk::container}[$key]
                }
            }
        } elsif (scalar(@{$_}) == 1 and ref @{$_}[0]) {
            $_ = @{$_}[0];
        } else {
            my $tail = ${$_}[scalar(@{$_})-1];
            if (ref $tail eq 'ARRAY' and scalar(@{$tail}) == 0) {
                $#{$_}--;
            }
        }
    }
}

sub get {
    my $begin = shift;
    $begin = 0 unless $begin;
    my $end = shift();
    $end = $begin + 25 unless $end;
    my $threads;
    {
        local $/;
        open(my $f, '<emails.json');
        $threads = decode_json <$f>;
        close($f);
    }
    $threads = [ @{$threads}[$begin .. $end] ];
    walkdepth(\&eliminate_singleton, $threads);
    return $threads;
}

print JSON::XS->new->ascii->pretty->encode(&get('subject:joke'));

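Although it successfully removes the empty arrayrefs, it fails to collapse the singletons. How can this code be corrected so that it collapses the singletons as well?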

1 answer:

Answer 0 (score: 0)

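I can see that you want to remove empty arrays that are elements of an array, but I don't understand the part about replacing every singleton arrayref with a reference to its element. Do you perhaps mean replacing every hash value that is a single-element array with its contents?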

So

[
  "data1",
  [],
  "data3"
]

is converted to

[
  "data1",
  "data3"
]

and

{
  "key1" : ["val1", "val2"],
  "key2" : ["val3"],
  "key3" : ["val4", "val5"]
}

is converted to

{
  "key1" : ["val1", "val2"],
  "key2" : "val3",
  "key3" : ["val4", "val5"]
}

In your program, the latter corresponds to "tags" : ["inbox"] becoming "tags" : "inbox".

If that is the case, this version of eliminate_singleton does what you need.

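It works from the point of view of the container node and checks whether anything inside it needs to be modified. Working from the point of view of the node itself could cause a node to be modified while it is still being scanned, which would break the program. Looping backwards from the end of the array is safe, because removing an element never displaces a node that has not yet been visited.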
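
As a brief aside before the function itself, the point about looping backwards can be seen in a minimal sketch. The two arrays are hypothetical sample data, not taken from emails.json, and elements are removed with splice:

```perl
use strict;
use warnings;

# Hypothetical sample data: two adjacent empty arrayrefs between scalars.
my @forward  = ('a', [], [], 'b');
my @backward = ('a', [], [], 'b');

# Forward loop: once splice removes index 1, the second [] slides down
# into index 1, but $i moves on to 2, so that element is never examined.
for (my $i = 0; $i < @forward; $i++) {
    splice @forward, $i, 1
        if ref $forward[$i] eq 'ARRAY' and !@{ $forward[$i] };
}

# Backward loop: removing index $i only shifts elements already visited.
for (my $i = $#backward; $i >= 0; $i--) {
    splice @backward, $i, 1
        if ref $backward[$i] eq 'ARRAY' and !@{ $backward[$i] };
}

printf "forward:  %d elements left\n", scalar @forward;   # 3
printf "backward: %d elements left\n", scalar @backward;  # 2
```

The forward loop leaves one empty arrayref behind, while the backward loop removes both.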

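use Scalar::Util 'reftype';

sub eliminate_singleton {

  my $node = $_;    # Data::Walk passes the current node in $_
  my $type = reftype($node) // '';

  if ($type eq 'ARRAY') {
    # Loop backwards so that a removal never shifts an unvisited element
    for (my $i = $#$node; $i >= 0; $i--) {
      my $subnode = $node->[$i];
      my $subtype = reftype($subnode) // '';
      # splice really removes the element; delete on an array element
      # in the middle would only leave an undefined hole behind
      splice @$node, $i, 1 if $subtype eq 'ARRAY' and @$subnode == 0;
    }
  }
  elsif ($type eq 'HASH') {
    for my $k (keys %$node) {
      my $subnode = $node->{$k};
      my $subtype = reftype($subnode) // '';
      # Replace a single-element array value with its only element
      if ($subtype eq 'ARRAY' and @$subnode == 1) {
        $node->{$k} = $subnode->[0];
      }
    }
  }
}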
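
If the goal is instead exactly the output shown in the question, a plain recursive function may be enough. The following is only a sketch under stated assumptions: prune is a hypothetical helper (not part of Data::Walk), JSON::PP stands in for JSON::XS so that the example needs only core modules, and hash contents are assumed to need no pruning. It takes the same container-side view and decides about each child before descending, so an array that only becomes a singleton after pruning is left as an array.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; stands in for JSON::XS from the question

# prune() is a hypothetical helper. It modifies the array in place:
#   - empty arrayref children are removed,
#   - children of the form [ {...} ] are replaced by the hashref,
#   - any other arrayref child is pruned recursively.
# Hash contents are left untouched (an assumption of this sketch).
sub prune {
    my ($node) = @_;
    return $node unless ref $node eq 'ARRAY';

    # Walk backwards so that splice never shifts an unvisited element.
    for (my $i = $#$node; $i >= 0; $i--) {
        my $child = $node->[$i];
        next unless ref $child eq 'ARRAY';

        if (!@$child) {
            splice @$node, $i, 1;
        }
        elsif (@$child == 1 and ref $child->[0] eq 'HASH') {
            $node->[$i] = $child->[0];
        }
        else {
            prune($child);
        }
    }
    return $node;
}

# The two sample elements from the question.
my $data = decode_json(<<'JSON');
[
    [ [ {"payload":"test"} ], [ [ {"payload":"reply"} ], [] ] ],
    [ [ {"payload":"another thread"} ] ]
]
JSON

prune($data);
print JSON::PP->new->canonical->encode($data), "\n";
# [[{"payload":"test"},[{"payload":"reply"}]],[{"payload":"another thread"}]]
```

Deciding before descending is a deliberate choice here: with a children-first walk, [ [ {...} ] ] would collapse twice and end up as a bare hashref, which is not what the expected output in the question shows.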