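Edit: Sorry, I mistakenly typed 'name' when I meant 'ref', and I have now included the complete attributes.
I have some XML files that each contain a complete XML document on a single line. An example is: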
<Reqeusts>
<WRRequest><Request domain="foo.com"><Rows><Row includeascolumn="n" interval="hour" ref="time" type="group"/><Row includeascolumn="n" ref="domain_id" type="group"/><Row />...</Rows><Columns><Column ref="user_id"/><Column ref="country_id"/><Column ref="country_name"/>...</Columns></Request></WRRequest>
.
.
.
</Requests>
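For clarity, I have left out many of the attributes.
I am using XML::Parser and XML::SimpleObject, and they work fine. For example, I am simply printing out the attributes of each element, and that works, except when I try to print the 'ref' attribute of the Column elements. Then I get an "uninitialized value" error. The code is: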
#!/usr/bin/perl
use warnings;
use diagnostics;
use XML::Parser;
use XML::SimpleObject;
use Cwd;
if ($ARGV[0] eq "") {
    die "usage: sumXML.pl <input file> \n";
}
my $fileName = $ARGV[0];
my $parser = new XML::Parser(Style => 'Tree');
my $xso = XML::SimpleObject->new( $parser->parsefile("$fileName") );
foreach my $wrRequest ($xso->child('WRRequests')->children('RWRequest')) {
    print "Client Name: " . $wrRequest->attribute('clientName') . "\n";
    foreach my $xmlRequest ($wrRequest->child('REQUEST')) {
        print "Domain name: " . $xmlRequest->attribute('domain') . "\n";
        print "Service: " . $xmlRequest->attribute('service') . "\n";
        foreach my $xmlRow ($xmlRequest->child('ROWS')->children('ROW')) {
            print "Row Reference: " . $xmlRow->attribute('ref') . "\n";
        }
        foreach my $xmlColumn ($xmlRequest->child('COLUMNS')->children('COLUMN')) {
            print "Column Reference: " . $xmlColumn->attribute('ref') . "\n";
        }
    }
    print "\n";
}
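Answer 0 (score: 1)
Your example data does not parse (even if you remove the dots), so it is not valid XML. I am not sure what your actual data looks like, but that is very important for finding the problem.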
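I am sure there is nothing wrong with XML::Parser or XML::SimpleObject. So please check whether all of your data is what you expect it to be (for example: do all REQUEST elements have a service attribute? Does every ROW have a ref attribute?). If the attributes are missing, you must either reject the input data or work with the data you have. That of course depends on your requirements. I actually took the time to get it working (I only changed the case of the element names and slightly modified the "example data"):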
use strict;
use warnings;
use XML::Parser;
use XML::SimpleObject;
use Cwd;
my $inXML = join "", <DATA>;
print $inXML;
my $parser = new XML::Parser(Style => 'Tree');
my $xso = XML::SimpleObject->new( $parser->parse($inXML) );
foreach my $wrRequest ($xso->child('Requests')->children('WRRequest')) {
    print "Client Name: " . $wrRequest->attribute('clientName') . "\n";
    foreach my $xmlRequest ($wrRequest->child('Request')) {
        print "Domain name: " . $xmlRequest->attribute('domain') . "\n";
        print "Service: " . $xmlRequest->attribute('service') . "\n";
        foreach my $xmlRow ($xmlRequest->child('Rows')->children('Row')) {
            print "Row Reference: " . $xmlRow->attribute('ref') . "\n";
        }
        foreach my $xmlColumn ($xmlRequest->child('Columns')->children('Column')) {
            print "Column Reference: " . $xmlColumn->attribute('ref') . "\n";
        }
    }
    print "\n";
}
__DATA__
<Requests>
<WRRequest clientName="foo">
<Request service="fooService" domain="foo.com">
<Rows>
<Row includeascolumn="n" interval="hour" ref="time" type="group"/>
<Row includeascolumn="n" ref="domain_id" type="group"/>
</Rows>
<Columns>
<Column ref="user_id"/>
<Column ref="country_id"/>
<Column ref="country_name"/>
</Columns>
</Request>
</WRRequest>
</Requests>
Output:
Client Name: foo
Domain name: foo.com
Service: fooService
Row Reference: time
Row Reference: domain_id
Column Reference: user_id
Column Reference: country_id
Column Reference: country_name
I have tested it with multiple WRRequest elements (copy and paste), and it works like a charm.
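The check suggested above (does every element really carry the attribute?) can also be done at print time. The following is only a sketch: `attr_or` is a hypothetical helper, not part of XML::SimpleObject, and `FakeElement` is a made-up stand-in that mimics just the `attribute()` accessor so the snippet runs without the CPAN modules. It assumes that `attribute()` yields undef for a missing attribute, which is what the "uninitialized value" warning suggests.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an XML::SimpleObject element. It exists only so
# this sketch is self-contained; it mimics the attribute() accessor.
{
    package FakeElement;
    sub new       { my ($class, %attrs) = @_; return bless {%attrs}, $class }
    sub attribute { my ($self, $name) = @_; return $self->{$name} }
}

# Hypothetical helper: return the attribute value, or a placeholder when the
# element does not carry that attribute, so that string concatenation never
# sees undef.
sub attr_or {
    my ($element, $name, $default) = @_;
    my $value = $element->attribute($name);    # scalar context on purpose
    return defined $value ? $value : $default;
}

my @columns = (
    FakeElement->new(ref => 'user_id'),
    FakeElement->new(),                        # this one has no 'ref'
);

for my $column (@columns) {
    print "Column Reference: " . attr_or($column, 'ref', '(missing)') . "\n";
}
```

With real data, the same helper could be called on the objects returned by `children('Column')`. Whether to print a placeholder or to die on a missing attribute depends on whether the input is allowed to omit it.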
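Answer 1 (score: 1)
I cannot tell how the data should ideally be organized, but I find XML::Rules handy in cases like this. If you are open to a completely different approach, here is an example (I am assuming that 'ref' is the key for each row, that the column names should stay in order, that the 'ref' attribute is all you care about for columns, and so on):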
use strict;
use warnings;
use Data::Dumper;
use XML::Rules;
my $xml = <<XML;
<Requests>
<WRRequest>
<Request domain="foo.com" service="SomeService">
<Rows>
<Row includeascolumn="n" interval="hour" ref="time" type="group"/>
<Row includeascolumn="n" ref="domain_id" type="group"/>
</Rows>
<Columns>
<Column ref="user_id"/>
<Column ref="country_id"/>
<Column ref="country_name"/>
</Columns>
</Request>
</WRRequest>
</Requests>
XML
my @rules = (
    Request => sub { delete $_[1]->{_content}; print Dumper $_[1]; return },
    Rows    => 'pass no content',
    Columns => 'pass no content',
    Row     => 'no content by ref',
    Column  => sub { '@'.$_[0] => $_[1]{ref} },
);
my $p = XML::Rules->new(
    rules => \@rules,
);
$p->parse($xml);
__END__
$VAR1 = {
    'Column' => [
        'user_id',
        'country_id',
        'country_name'
    ],
    'domain' => 'foo.com',
    'time' => {
        'type' => 'group',
        'includeascolumn' => 'n',
        'interval' => 'hour'
    },
    'domain_id' => {
        'type' => 'group',
        'includeascolumn' => 'n'
    },
    'service' => 'SomeService'
};
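To show how the dumped structure might be consumed, here is a small sketch that prints the same fields as the original script. The hash is copied by hand from the Dumper output above; in a real script it would be the `$_[1]` that the Request rule receives. The variable names `$request` and `@row_refs` are mine, chosen for illustration.

```perl
use strict;
use warnings;

# Copied from the Dumper output above (an assumption for this sketch; the
# Request rule would normally receive this hash as $_[1]).
my $request = {
    Column    => [ 'user_id', 'country_id', 'country_name' ],
    domain    => 'foo.com',
    service   => 'SomeService',
    time      => { type => 'group', includeascolumn => 'n', interval => 'hour' },
    domain_id => { type => 'group', includeascolumn => 'n' },
};

print "Domain name: $request->{domain}\n";
print "Service: $request->{service}\n";

# Rows were keyed by their 'ref', so each row is a nested hash. Hash keys
# have no inherent order, so they are sorted here to get stable output.
my @row_refs = grep { ref($request->{$_}) eq 'HASH' } sort keys %$request;
print "Row Reference: $_\n" for @row_refs;

# Columns were collected into an array, so document order is preserved.
print "Column Reference: $_\n" for @{ $request->{Column} };
```

Note the trade-off in this layout: keying rows by 'ref' makes lookups easy but loses the original row order, while the array used for columns keeps the order.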