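I want to parse this XML with Perl. The XML shown here is only part of a larger, more deeply nested document. I have tried the usual parsers, and most of them return their output as a hash, which is hard to read and makes the child nodes hard to reach.
I want to get the elements and read all of their attribute values.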
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<TR name="App.exe" total="573" errors="1" failures="2" not-run="4" inconclusive="2" ignored="4" skipped="0" invalid="0" date="2015-01-12" time="17:43:59">
<environment version="2" cversion="44" os-version="Microsoft" platform="Win32NT" cwd="" machine-name="" user="me" user-domain="domain" />
<culture-info current-culture="en-US" current-uiculture="en-US" />
<TS type="Assembly" name="App.exe" executed="True" result="Failure" success="False" time="22" asserts="0">
<RS>
<TS type="Namespace" name="MyAPP" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
<RS>
<TS type="Namespace" name="Project" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
<RS>
<TS type="Namespace" name="Website" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
<RS>
<TS type="Namespace" name="Service" executed="True" result="Failure" success="False" time="2335.163" asserts="0">
<RS>
<TS type="SetUpFixture" name="Tests" executed="True" result="Failure" success="False" time="2335.163" asserts="0">
<RS>
<TS type="Namespace" name="tempt" executed="True" result="Success" success="True" time="8.935" asserts="0">
<RS>
<TS type="ParameterizedFixture" name="TempAPI" executed="True" result="Success" success="True" time="8.935" asserts="0">
<RS>
<TS type="TestFixture" name="Admin" executed="True" result="Success" success="True" time="3.306" asserts="2">
<RS>
<TC name="testName1" executed="True" result="Success" success="True" time="0.352" asserts="0" />
<TC name="testName2" executed="True" result="Success" success="True" time="0.005" asserts="0" />
</RS>
</TS>
<TS type="TestFixture" name="Client" executed="True" result="Success" success="True" time="2.620" asserts="1">
<RS>
<TC name="testName3" executed="True" result="Success" success="True" time="0.319" asserts="0" />
<TC name="testName4" executed="True" result="Success" success="True" time="0.000" asserts="0" />
</RS>
</TS>
<TS type="TestFixture" name="Employee" executed="True" result="Success" success="True" time="3.007" asserts="1">
<RS>
<TC name="testName5" executed="True" result="Success" success="True" time="0.290" asserts="0" />
<TC name="testName6" executed="True" result="Success" success="True" time="0.000" asserts="0" />
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</RS>
</TS>
</TR>
This is what I tried; as I said, it gives hash output that is hard to read and hard to pull details from.
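use XML::Simple;                  # provides XMLin
use Data::Dumper;                 # provides Dumper
use File::Slurp qw(write_file);   # assuming write_file comes from File::Slurp
my $list = XMLin('F:\Sample.xml', KeepRoot => 1);
#print $list->{TR}{TS}{name};
print Dumper($list);
write_file 'F:\mydump.log', Dumper($list);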
I need a suggestion for a parser whose output is easier to read than a hash.
With XML::Simple I get the following format:
$VAR1 = {
'TR' => {
'failures' => '2',
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '22',
'name' => 'App.exe',
'executed' => 'True',
'type' => 'Assembly',
'RS' => {
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '2335.164',
'name' => 'MyAPP',
'executed' => 'True',
'type' => 'Namespace',
'RS' => {
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '2335.164',
'name' => 'Project',
'executed' => 'True',
'type' => 'Namespace',
'RS' => {
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '2335.164',
'name' => 'Web',
'executed' => 'True',
'type' => 'Namespace',
'RS' => {
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '2335.163',
'name' => 'Server',
'executed' => 'True',
'type' => 'Namespace',
'RS' => {
'TS' => {
'asserts' => '0',
'success' => 'False',
'time' => '2335.163',
'name' => 'Tests',
'Client' => {
'success' => 'True',
'asserts' => '1',
'time' => '2.620',
'executed' => 'True',
'type' => 'TestFixture',
'RS' => {
'TC' => {
'testName3' => {
'success' => 'True',
'asserts' => '0',
'time' => '0.319',
'executed' => 'True',
'result' => 'Success'
},
'testName4' => {
'success' => 'True',
'asserts' => '0',
'time' => '0.000',
'executed' => 'True',
'result' => 'Success'
}
}
},
'result' => 'Success'
},
'Admin' => {
'success' => 'True',
'asserts' => '2',
'time' => '3.306',
'executed' => 'True',
'type' => 'TestFixture',
'RS' => {
'TC' => {
'testName1' => {
'success' => 'True',
'asserts' => '0',
'time' => '0.352',
'executed' => 'True',
'result' => 'Success'
},
'testName2' => {
'success' => 'True',
'asserts' => '0',
'time' => '0.005',
'executed' => 'True',
'result' => 'Success'
}
}
},
'result' => 'Success'
}
}
},
'result' => 'Success'
}
},
'result' => 'Success'
}
},
'result' => 'Failure'
}
},
'result' => 'Failure'
}
},
'result' => 'Failure'
}
},
'result' => 'Failure'
}
},
'result' => 'Failure'
}
},
'result' => 'Failure'
},
'culture-info' => {
'current-culture' => 'en-US',
'current-uiculture' => 'en-US'
},
'errors' => '1',
'time' => '17:43:59',
'date' => '2015-01-12',
'not-run' => '4',
'name' => 'App.exe',
'ignored' => '4',
'total' => '573',
'skipped' => '0',
'environment' => {
'user-domain' => 'domain',
'nunit-version' => '2.6.3.13283',
'os-version' => 'Microsoft Windows NT 6.2.9200.0',
'cwd' => '',
'user' => 'me',
'platform' => 'Win32NT',
'clr-version' => '4.0.30319.34014',
'machine-name' => ''
},
'inconclusive' => '2',
'invalid' => '0'
}
};
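Answer 0 (score: 4)
Don't use XML::Simple. The name is a misnomer: it is not simple at all, it is for simple XML. Its own documentation says:
The use of this module in new code is discouraged.
Try XML::Twig instead.
Part of your problem is simply that you have a deeply nested XML structure, and there are only so many ways of "showing" that.
What any XML parser does, though, is turn your XML into a Perl data structure, often a hash or a tree of objects. Most will also let you print that structure back out as "proper" XML.
So, for a simple reformatting task, XML::Twig will let you do this: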
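#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Called once for each <TC> element as soon as the parser has read it.
sub handle_tc {
    my ( $twig, $tc ) = @_;
    # atts() returns a hash ref of attribute name => value.
    foreach my $attr ( keys %{ $tc->atts() } ) {
        print "$attr = " . $tc->att($attr) . "\n";
    }
    print "\n";
}

# Parse the source XML file (not the Dumper log); parsefile returns the twig.
my $twig_parser = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => { 'TC' => \&handle_tc },
)->parsefile('F:\Sample.xml');

print "\n\nWhole XML pretty_print\n\n";
$twig_parser->print;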
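As it parses, this prints every attribute of each 'TC' element. The handler is called each time the parser encounters a TC element, and it is passed that subset of the XML.
For comparison, $twig_parser->print reformats the document according to the 'pretty_print' option and prints it. (Given your source XML, though, it probably won't change much.)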
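If you also need to know which fixture each test case belongs to, the same handler approach can look upwards in the tree. The following is a minimal sketch rather than a definitive solution: it assumes a trimmed-down inline copy of the XML (the real file would be read with parsefile), and the @rows array and the output layout are purely illustrative. It relies on XML::Twig's parent() accepting a condition, in which case it returns the nearest enclosing element that matches.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Trimmed-down stand-in for the question's XML (illustrative only).
my $xml = <<'XML';
<TR name="App.exe">
  <TS type="TestFixture" name="Admin" result="Success">
    <RS>
      <TC name="testName1" result="Success" time="0.352" />
      <TC name="testName2" result="Success" time="0.005" />
    </RS>
  </TS>
  <TS type="TestFixture" name="Client" result="Success">
    <RS>
      <TC name="testName3" result="Success" time="0.319" />
    </RS>
  </TS>
</TR>
XML

my @rows;    # one "fixture/test result time" string per TC element

XML::Twig->new(
    twig_handlers => {
        TC => sub {
            my ( $twig, $tc ) = @_;
            # parent('TS') walks up to the nearest enclosing TS element,
            # skipping the intermediate RS wrapper.
            my $fixture = $tc->parent('TS');
            push @rows, join ' ',
                $fixture->att('name') . '/' . $tc->att('name'),
                $tc->att('result'),
                $tc->att('time');
        },
    },
)->parse($xml);

print "$_\n" for @rows;
```

Because the handler only touches the attributes it needs, the nesting depth of the TS/RS wrappers stops mattering: each TC is reported as a single line.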
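Answer 1 (score: 1)
Based on the comments, if you only want the TC nodes, you can parse the XML file and iterate over its nodes; whenever a node's tag is TC, extract or print the information you need.
Alternatively, you could use a regular expression while reading the file to capture the TC nodes and then extract the information you need.
What you get from XML parsers is what you dumped, and that is what you should expect to get, so I'm not sure exactly what you are after. A flatter structure without the nesting?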
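If a flatter structure is indeed the goal, one possible sketch of the "parse, iterate, keep only TC" idea is shown here. It uses XML::Twig as the parser (an assumption; this answer does not name one), a shortened inline sample instead of the real file, and a hypothetical @flat array of hashes with an added path key built from the names of the enclosing TS elements.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Shortened stand-in for the question's XML (illustrative only).
my $xml = <<'XML';
<TR name="App.exe">
  <TS type="Namespace" name="tempt">
    <RS>
      <TS type="TestFixture" name="Admin">
        <RS>
          <TC name="testName1" result="Success" />
          <TC name="testName2" result="Success" />
        </RS>
      </TS>
      <TS type="TestFixture" name="Client">
        <RS>
          <TC name="testName3" result="Success" />
        </RS>
      </TS>
    </RS>
  </TS>
</TR>
XML

# Build the whole tree first, then walk it.
my $twig = XML::Twig->new->parse($xml);

my @flat;    # one flat hash per TC: its attributes plus a 'path' key
for my $tc ( $twig->root->descendants('TC') ) {
    # ancestors('TS') lists innermost first, so reverse for outermost first.
    my @suites = reverse map { $_->att('name') } $tc->ancestors('TS');
    push @flat, { %{ $tc->atts }, path => join( '.', @suites ) };
}

printf "%-14s %-10s %s\n", $_->{path}, $_->{name}, $_->{result} for @flat;
```

Each entry in @flat is a one-level hash, so a value such as $flat[0]{result} can be read without walking the nested TS/RS levels.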