How to calculate pause values in an XML file using Perl

Time: 2014-04-05 07:46:22

Tags: xml perl

I have an XML file containing the following information:

<Key Time="54288" Type="insert" Value="E" />
<Key Time="55288" Type="insert" Value="A" />
<Key Time="58298" Type="insert" Value="H" />
<Key Time="58398" Type="insert" Value="A" />
<Key Time="58498" Type="insert" Value="L" />
<Key Time="59298" Type="insert" Value="L" />    
<Key Time="64298" Type="insert" Value="O" />

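I first needed to calculate the pause duration between consecutive Key entries, but only when the pause (the interval since the previous Time) is equal to or greater than 2400.

For this I was given the script below, which also shows the time at which each pause started.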

perl -nle '
   /<Key +Time\s*=\s*"([0-9]+)\s*"/ and push @nums,$1; 
   END{ 
       for(1..$#nums){ 
           $pause=$nums[$_]-$nums[$_-1];
           $pause >=2400 ? print "$pause started at ".$nums[$_-1] : ()
       }
   }' your_file_here > output_file

This outputs:

3010 started at 55288
5000 started at 59298

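However, I now need to improve the script so that it retrieves all the values between two pauses of >= 2400, including the value at which the pause begins. For example, from Time="54288" to Time="55288" I have EA; from Time="58298" to Time="59298" I have HALL, and so on.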

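2 answers:

Answer 0: (score: 1)

This does what I think you want: it generates a list of the Value attributes, split wherever the gap between consecutive entries is 2400 or more (forty minutes, if the Time values are in seconds).

I used a proper XML parser module, XML::Twig, to do this. Parsing XML with regular expressions will get you into trouble.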

use strict;
use warnings;

use XML::Twig;

my @nums;
my $start_time;
my @blocks = ( '' );

my $twig = XML::Twig->new(
   twig_handlers => { Key => \&key_handler }
);
$twig->parse(*DATA);

print "$_\n" for @blocks;

sub key_handler {
  my ($twig, $key) = @_;
  my $time = $key->{att}{Time};

  if (defined $start_time) {
    my $pause = $time - $start_time; 
    push @blocks, ("$pause from $start_time to $time", '') if $pause >= 2400;
  }

  $start_time = $time;
  $blocks[-1] .= $key->{att}{Value};
}

__DATA__
<root>
  <Key Time="54288" Type="insert" Value="E" />
  <Key Time="55288" Type="insert" Value="A" />
  <Key Time="58298" Type="insert" Value="H" />
  <Key Time="58398" Type="insert" Value="A" />
  <Key Time="58498" Type="insert" Value="L" />
  <Key Time="59298" Type="insert" Value="L" />    
  <Key Time="64298" Type="insert" Value="O" />
</root>

Output

EA
3010 from 55288 to 58298
HALL
5000 from 59298 to 64298
O
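
To illustrate the warning about regular expressions, here is a minimal sketch that applies the pattern from the question's one-liner to four spellings of the same Key element. Only the first spelling comes from the question; the other three are hypothetical variants made up for this example, and regex_time is an illustrative helper name. It uses core Perl only.

```perl
use strict;
use warnings;

# Pattern copied from the one-liner in the question.
my $re = qr/<Key +Time\s*=\s*"([0-9]+)\s*"/;

# Four spellings of the same element. An XML parser reports the same
# attributes for all of them. The last three are hypothetical variants,
# not taken from the question's file.
my @variants = (
    [ 'as in the question',   '<Key Time="54288" Type="insert" Value="E" />' ],
    [ 'attributes reordered', '<Key Type="insert" Time="54288" Value="E" />' ],
    [ 'single quotes',        "<Key Time='54288' Type='insert' Value='E' />" ],
    [ 'newline after Key',    "<Key\nTime=\"54288\" Type=\"insert\" Value=\"E\" />" ],
);

# Return the captured Time, or undef when the pattern does not match.
sub regex_time {
    my ($xml) = @_;
    return $xml =~ $re ? $1 : undef;
}

for my $variant (@variants) {
    my ($label, $xml) = @$variant;
    my $time = regex_time($xml);
    printf "%-22s %s\n", $label, defined $time ? "Time=$time" : 'no match';
}
```

Only the first spelling matches. The pattern requires Time to be the first attribute, separated from Key by spaces and quoted with double quotes, none of which XML itself requires.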

Answer 1: (score: 0)

Modeled on Borodin's solution, but using XML::LibXML instead.

use strict;
use warnings;

use XML::LibXML;

my $string = do {local $/; <DATA>};

my $dom = XML::LibXML->load_xml(string => $string);

my @blocks = '';

my $lasttime;
for my $node ($dom->findnodes('//Key')) {
    my $time = $node->getAttribute('Time');

    if (defined $lasttime) {
        my $pause = $time - $lasttime;
        push @blocks, "pause from $lasttime to $time", '' if $pause >= 2400;
    }
    $blocks[-1] .= $node->getAttribute('Value');
    $lasttime = $time;
}

print "$_\n" for @blocks;

__DATA__
<root>
  <Key Time="54288" Type="insert" Value="E" />
  <Key Time="55288" Type="insert" Value="A" />
  <Key Time="58298" Type="insert" Value="H" />
  <Key Time="58398" Type="insert" Value="A" />
  <Key Time="58498" Type="insert" Value="L" />
  <Key Time="59298" Type="insert" Value="L" />    
  <Key Time="64298" Type="insert" Value="O" />
</root>
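
The grouping step that both answers share can also be separated from the XML parsing. The following is a minimal sketch under that assumption: group_by_pause is a hypothetical helper name (it is not part of XML::Twig or XML::LibXML), and it takes plain [time, value] pairs, which either parser could supply. It also records the first and last Time of each block, which is the form the question describes.

```perl
use strict;
use warnings;

# Split a list of [time, value] pairs into blocks of values, starting a new
# block whenever the gap since the previous time is at least $threshold.
# group_by_pause is a hypothetical helper, not part of either XML module.
sub group_by_pause {
    my ($threshold, @keys) = @_;
    my @blocks;
    my $last_time;
    for my $key (@keys) {
        my ($time, $value) = @$key;
        # The first entry, or a long enough pause, opens a new block.
        if (!defined $last_time or $time - $last_time >= $threshold) {
            push @blocks, { start => $time, end => $time, values => '' };
        }
        $blocks[-1]{values} .= $value;
        $blocks[-1]{end}     = $time;
        $last_time = $time;
    }
    return @blocks;
}

# Time/Value pairs copied from the sample data in the question.
my @keys = (
    [ 54288, 'E' ], [ 55288, 'A' ], [ 58298, 'H' ], [ 58398, 'A' ],
    [ 58498, 'L' ], [ 59298, 'L' ], [ 64298, 'O' ],
);

for my $block (group_by_pause(2400, @keys)) {
    print "$block->{values} from $block->{start} to $block->{end}\n";
}
```

For the sample data this prints "EA from 54288 to 55288", "HALL from 58298 to 59298" and "O from 64298 to 64298". Keeping the threshold as a parameter makes it easy to try values other than 2400.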