I have an XML file containing the following information:
<Key Time="54288" Type="insert" Value="E" />
<Key Time="55288" Type="insert" Value="A" />
<Key Time="58298" Type="insert" Value="H" />
<Key Time="58398" Type="insert" Value="A" />
<Key Time="58498" Type="insert" Value="L" />
<Key Time="59298" Type="insert" Value="L" />
<Key Time="64298" Type="insert" Value="O" />
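I first needed to calculate the pause duration between consecutive Key entries, but only where the pause (the interval since the previous Time) is 2400 or greater.
I was given the script below for this, which also shows the time at which each pause started.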
perl -nle '
/<Key +Time\s*=\s*"([0-9]+)\s*"/ and push @nums,$1;
END{
for(1..$#nums){
$pause=$nums[$_]-$nums[$_-1];
$pause >=2400 ? print "$pause started at ".$nums[$_-1] : ()
}
}' your_file_here > output_file
This outputs:
3010 started at 55288
5000 started at 59298
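However, I now need to improve the script so that it retrieves all the values between two pauses of 2400 or longer, including the value at which the pause starts. For example, from Time="54288" to Time="55288" I have EA; from Time="58298" to Time="59298" I have HALL; and so on.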
Answer 0 (score: 1)
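This does what I think you want, which is to produce the runs of Value attributes that are separated by pauses of 2400 or more (40 minutes, if Time is in seconds).
I used a proper XML parser module, XML::Twig, to do this. Parsing XML with regular expressions will get you into trouble.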
use strict;
use warnings;
use XML::Twig;
my $start_time;
my @blocks = ( '' );
my $twig = XML::Twig->new(
twig_handlers => { Key => \&key_handler }
);
$twig->parse(*DATA);
print "$_\n" for @blocks;
sub key_handler {
my ($twig, $key) = @_;
my $time = $key->{att}{Time};
if (defined $start_time) {
my $pause = $time - $start_time;
push @blocks, ("$pause from $start_time to $time", '') if $pause >= 2400;
}
$start_time = $time;
$blocks[-1] .= $key->{att}{Value};
}
__DATA__
<root>
<Key Time="54288" Type="insert" Value="E" />
<Key Time="55288" Type="insert" Value="A" />
<Key Time="58298" Type="insert" Value="H" />
<Key Time="58398" Type="insert" Value="A" />
<Key Time="58498" Type="insert" Value="L" />
<Key Time="59298" Type="insert" Value="L" />
<Key Time="64298" Type="insert" Value="O" />
</root>
Output
EA
3010 from 55288 to 58298
HALL
5000 from 59298 to 64298
O
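To illustrate the warning about regular expressions, here is a small sketch that applies the pattern from the question's one-liner to three spellings of the same element. The second and third variants are made up for this example (they do not appear in the asker's file), but all three are equivalent XML, and the pattern only recognises the first.

```perl
use strict;
use warnings;

# The pattern from the question's one-liner: it expects Time to be the
# first attribute and to be written with double quotes.
my $re = qr/<Key +Time\s*=\s*"([0-9]+)\s*"/;

# Three spellings of the same element. The last two are hypothetical
# variants invented for this illustration.
my @variants = (
    '<Key Time="54288" Type="insert" Value="E" />',
    '<Key Type="insert" Time="54288" Value="E" />',   # attributes reordered
    "<Key Time='54288' Type='insert' Value='E' />",   # single quotes
);

# 1 where the pattern matched, 0 where it silently missed the element.
my @matched = map { /$re/ ? 1 : 0 } @variants;

print "$variants[$_] => ", ($matched[$_] ? 'matched' : 'missed'), "\n"
    for 0 .. $#variants;
```

An XML parser reads all three variants the same way, which is the reason for using one here.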
Answer 1 (score: 0)
Modeled on Borodin's solution, but using XML::LibXML instead:
use strict;
use warnings;
use XML::LibXML;
my $string = do {local $/; <DATA>};
my $dom = XML::LibXML->load_xml(string => $string);
my @blocks = '';
my $lasttime;
for my $node ($dom->findnodes('//Key')) {
my $time = $node->getAttribute('Time');
if (defined $lasttime) {
my $pause = $time - $lasttime;
push @blocks, "pause from $lasttime to $time", '' if $pause >= 2400;
}
$blocks[-1] .= $node->getAttribute('Value');
$lasttime = $time;
}
print "$_\n" for @blocks;
__DATA__
<root>
<Key Time="54288" Type="insert" Value="E" />
<Key Time="55288" Type="insert" Value="A" />
<Key Time="58298" Type="insert" Value="H" />
<Key Time="58398" Type="insert" Value="A" />
<Key Time="58498" Type="insert" Value="L" />
<Key Time="59298" Type="insert" Value="L" />
<Key Time="64298" Type="insert" Value="O" />
</root>
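The grouping step that both answers share can also be sketched on its own, separate from the XML parsing. The version below is only an illustration: `group_runs` is a hypothetical helper name, and the input is assumed to be a list of [Time, Value] pairs that a parser has already extracted. It records the start and end time of each run, which is the form the question asks for.

```perl
use strict;
use warnings;

# Hypothetical helper: group [time, value] pairs into runs, starting a new
# run whenever the gap since the previous time is >= $threshold.
sub group_runs {
    my ($threshold, @keys) = @_;
    my @runs;
    for my $key (@keys) {
        my ($time, $value) = @$key;
        # Open a new run for the first key, or after a long enough pause.
        if (!@runs or $time - $runs[-1]{end} >= $threshold) {
            push @runs, { start => $time, end => $time, text => '' };
        }
        $runs[-1]{end}   = $time;
        $runs[-1]{text} .= $value;
    }
    return @runs;
}

# The sample data from the question, as [Time, Value] pairs.
my @keys = (
    [54288, 'E'], [55288, 'A'], [58298, 'H'], [58398, 'A'],
    [58498, 'L'], [59298, 'L'], [64298, 'O'],
);

my @lines = map { "$_->{text} from $_->{start} to $_->{end}" }
            group_runs(2400, @keys);
print "$_\n" for @lines;
```

With the sample data this prints EA from 54288 to 55288, HALL from 58298 to 59298, and O from 64298 to 64298, one per line. The pause between two runs is the next run's start minus the previous run's end.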