I have an XML file that looks like this:
<?xml version="1.0"?>
<data>
  <header>
    <name>V9 Red Indices</name>
    <version>9</version>
    <date>2017-03-16</date>
  </header>
  <index>
    <indexfamily>ITRAXX-Asian</indexfamily>
    <indexsubfamily>iTraxx Rest of Asia</indexsubfamily>
    <paymentfrequency>3M</paymentfrequency>
    <recoveryrate>0.35</recoveryrate>
    <constituents>
      <constituent>
        <refentity>
          <originalconstituent>
            <referenceentity>ICICI Bank Limited</referenceentity>
            <redentitycode>Y1BDCC</redentitycode>
            <role>Issuer</role>
            <redpaircode>Y1BDCCAA9</redpaircode>
            <jurisdiction>India</jurisdiction>
            <tier>SNRFOR</tier>
            <pairiscurrent>false</pairiscurrent>
            <pairvalidfrom>2002-03-30</pairvalidfrom>
            <pairvalidto>2008-10-22</pairvalidto>
            <ticker>ICICIB</ticker>
            <ispreferred>false</ispreferred>
            <docclause>CR</docclause>
            <recorddate>2014-02-25</recorddate>
            <weight>0.0769</weight>
          </originalconstituent>
        </refentity>
        <refobligation>
          <type>Bond</type>
          <isconvert>false</isconvert>
          <isperp>false</isperp>
          <coupontype>Fixed</coupontype>
          <ccy>USD</ccy>
          <maturity>2008-10-22</maturity>
          <coupon>0.0475</coupon>
          <isin>XS0178885876</isin>
          <cusip>Y38575AQ2</cusip>
          <event>Matured</event>
          <obligationname>ICICIB 4.75 22Oct08</obligationname>
          <prospectusinfo>
            <issuers>
              <origissuersasperprosp>ICICI Bank Limited</origissuersasperprosp>
            </issuers>
          </prospectusinfo>
        </refobligation>
      </constituent>
    </constituents>
  </index>
</data>
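I want to iterate over this file without knowing the tag names. My end goal is to build a hash of tag names and values.
I don't want to call findnodes with an XPath expression for every node. That defeats the whole purpose of writing a generic loader.
I'm also using XML-LibXML-2.0126, which is an old version.
Part of my code, which uses findnodes, is below. The XML has also been shortened so that this doesn't turn into an even longer question than it already is :)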
use XML::LibXML;

my $parser = XML::LibXML->new;
my $xmldoc = $parser->parse_file( $fileName );
my $root   = $xmldoc->getDocumentElement() || die( "Could not get Document Element \n" );

foreach my $index ( $root->findnodes( "index" ) ) {    # $root->getChildNodes()) # Get all the Indexes
    foreach my $constituent ( $index->findnodes( 'constituents/constituent' ) ) {    # Let's pick up all Constituents
        my $referenceentity = $constituent->findnodes( 'refentity/originalconstituent/referenceentity' );    # This is a crude way. We should be iterating without knowing what's inside
        print "referenceentity :" . $referenceentity . "\n";
        print "+++++++++++++++++++++++++++++++++++ \n";
    }
}
Answer 0 (score: 1)
Use the nonBlankChildNodes, nodeName and textContent methods provided by XML::LibXML::Node:
my %hash;
for my $node ( $oc->nonBlankChildNodes ) {
    my $tag   = $node->nodeName;
    my $value = $node->textContent;
    $hash{$tag} = $value;
}
This is equivalent to:
my %hash = map { $_->nodeName, $_->textContent } $oc->nonBlankChildNodes;
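The snippets above handle a single level: `$oc` is presumably the `originalconstituent` element, and each of its children is a leaf. For the generic loader the question asks about, the same three methods can be applied recursively. The following is a minimal sketch rather than a definitive implementation: `element_to_hash` is a hypothetical helper name (not part of XML::LibXML), the inline XML is a shortened copy of the sample, and it assumes sibling tag names are unique, so a repeated tag such as a second `constituent` would overwrite the first.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of XML::LibXML): turn an element into a
# nested hash keyed by tag name. Leaf elements become their text content.
sub element_to_hash {
    my ($elem) = @_;

    # Keep only element children; this skips comments and stray text nodes.
    my @children = grep { $_->isa('XML::LibXML::Element') } $elem->nonBlankChildNodes;

    # No element children means this is a leaf, so return its text.
    return $elem->textContent unless @children;

    my %hash;
    for my $child (@children) {
        # Assumption: sibling tag names are unique. A repeated tag
        # overwrites the earlier entry.
        $hash{ $child->nodeName } = element_to_hash($child);
    }
    return \%hash;
}

# Shortened copy of the sample document from the question.
my $xml = <<'XML';
<data>
  <header>
    <name>V9 Red Indices</name>
    <version>9</version>
  </header>
  <index>
    <indexfamily>ITRAXX-Asian</indexfamily>
  </index>
</data>
XML

my $doc  = XML::LibXML->load_xml( string => $xml );
my $data = element_to_hash( $doc->documentElement );

print "$data->{header}{name}\n";
print "$data->{index}{indexfamily}\n";
```

If the full file contains several `index` or `constituent` elements, one option is to store an array reference for repeated names instead of a single value.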
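Answer 1 (score: 0)
It is just as simple to access arbitrary data from an XML::LibXML::Document object as it is from a nested Perl hash. A hash will certainly occupy less memory than the equivalent object, if that is your intention, but from your question it doesn't sound as if it is.
You can do this easily with the XML::Parser module, which calls a callback each time an "event" occurs in the XML data. In this case the events we are interested in are opening tags, closing tags and text strings.
This example code builds a nested hash from the XML. It dies with an appropriate message if the XML data is malformed (a closing tag doesn't match the name of its opening tag) or if any element has one or more attributes, which cannot be represented in this structure.
I have used Data::Dump to display the result.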
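use strict;
use warnings 'all';

use XML::Parser;
use Data::Dump;

my $parser = XML::Parser->new(
    Style    => 'Debug',
    Handlers => {
        Start => \&handle_start,
        End   => \&handle_end,
        Char  => \&handle_char,
    },
);

my %data;
my @data_stack = ( \%data );    # hashes of the currently open elements
my @elem_stack;                 # names of the currently open elements

$parser->parsefile( 'index.xml' );

dd \%data;

# Opening tag: add an empty hash for this element to its parent's hash
sub handle_start {
    my ($expat, $elem) = @_;

    my $data = $data_stack[-1]{$elem} = { };
    push @data_stack, $data;
    push @elem_stack, $elem;

    # Any arguments after the element name are attribute name/value pairs
    if ( @_ > 2 ) {
        my $xpath = join '', map "/$_", @elem_stack;
        die qq{Element at $xpath has attributes};
    }
}

# Closing tag: check that it matches the innermost open element
sub handle_end {
    my ($expat, $elem) = @_;

    my $top_elem = pop @elem_stack;
    die qq{Bad XML structure $elem <=> $top_elem} unless $elem eq $top_elem;

    pop @data_stack;
}

# Text: replace the element's empty hash in its parent with the string
sub handle_char {
    my ($expat, $str) = @_;

    return unless $str =~ /\S/;

    my $top_elem = $elem_stack[-1];
    my $slot     = \$data_stack[-2]{$top_elem};

    # The parser may deliver one text node in several chunks, so append to
    # any text already stored instead of overwriting it
    if ( ref $$slot ) { $$slot = $str } else { $$slot .= $str }
}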
Output:
{
  data => {
    header => {
      date => "2017-03-16",
      name => "V9 Red Indices",
      version => 9,
    },
    index => {
      constituents => {
        constituent => {
          refentity => {
            originalconstituent => {
              docclause => "CR",
              ispreferred => "false",
              jurisdiction => "India",
              pairiscurrent => "false",
              pairvalidfrom => "2002-03-30",
              pairvalidto => "2008-10-22",
              recorddate => "2014-02-25",
              redentitycode => "Y1BDCC",
              redpaircode => "Y1BDCCAA9",
              referenceentity => "ICICI Bank Limited",
              role => "Issuer",
              ticker => "ICICIB",
              tier => "SNRFOR",
              weight => 0.0769,
            },
          },
          refobligation => {
            ccy => "USD",
            coupon => 0.0475,
            coupontype => "Fixed",
            cusip => "Y38575AQ2",
            event => "Matured",
            isconvert => "false",
            isin => "XS0178885876",
            isperp => "false",
            maturity => "2008-10-22",
            obligationname => "ICICIB 4.75 22Oct08",
            prospectusinfo => {
              issuers => {
                origissuersasperprosp => "ICICI Bank Limited"
              },
            },
            type => "Bond",
          },
        },
      },
      indexfamily => "ITRAXX-Asian",
      indexsubfamily => "iTraxx Rest of Asia",
      paymentfrequency => "3M",
      recoveryrate => 0.35,
    },
  },
}
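If the goal is a single flat hash of tag names and values rather than a nested one, a structure like the one dumped above can be flattened afterwards. This is only a sketch under stated assumptions: `flatten` is a hypothetical helper, the `/`-separated path keys are an illustrative choice, and the small `%data` literal stands in for the hash built from the XML. Elements that ended up as empty hashes produce no entry.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a nested hash into "a/b/c" => value pairs.
sub flatten {
    my ($node, $prefix, $flat) = @_;

    for my $key ( sort keys %$node ) {
        my $path  = length($prefix) ? "$prefix/$key" : $key;
        my $value = $node->{$key};

        if ( ref($value) eq 'HASH' ) {
            flatten( $value, $path, $flat );    # descend into the child element
        }
        else {
            $flat->{$path} = $value;            # leaf: record the text value
        }
    }

    return $flat;
}

# Stand-in for the nested hash built from the XML (shortened).
my %data = (
    data => {
        header => { name => 'V9 Red Indices', version => 9 },
        index  => { recoveryrate => 0.35 },
    },
);

my $flat = flatten( \%data, '', {} );

print "$_ => $flat->{$_}\n" for sort keys %$flat;
```

Keying by full path rather than by bare tag name avoids collisions when the same tag name appears under different parents.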