Perl XML::LibXML: findnodes can only read the root of the XML file

Date: 2015-12-27 21:01:32

Tags: xml perl nodes xml-libxml

I am trying to parse this .kml file:

<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Schema name="PostalCodeCanada" id="PostalCodeCanada">
    <SimpleField name="ZIP" type="string"></SimpleField>
    <SimpleField name="VERTICES" type="int"></SimpleField>
</Schema>
<Folder><name>PostalCodeCanada</name>
  <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
        <SimpleData name="ZIP">G1Y1B1</SimpleData>
        <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
      <Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
  </Placemark>
  <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
        <SimpleData name="ZIP">G1Y1B2</SimpleData>
        <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
      <Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
  </Placemark>
</Folder>
</Document></kml>

I am using Perl with XML::LibXML, but findnodes cannot read any node other than "/". Here is my code:

#!/usr/bin/env perl

use XML::LibXML;
use strict;
use warnings;

my $outputFilename = "PostalCodesCollegePro.kml";

my $intro = '<?xml version="1.0" encoding="utf-8" ?>'."\n".'<kml xmlns="http://www.opengis.net/kml/2.2">'."\n".'<Document id="root_doc">'."\n".'<Schema name="PostalCodeCanada" id="PostalCodeCanada">'."\n\t".'<SimpleField name="ZIP" type="string"></SimpleField>'."\n\t".'<SimpleField name="VERTICES" type="int"></SimpleField>'."\n".'</Schema>'."\n".'<Folder><name>PostalCodeCanada</name>'."\n";
my $outro = '</Folder>'."\n".'</Document></kml>'."\n";

open(my $fh, '>', $outputFilename) or die "Impossible d'ouvrir le fichier d'écriture: $!";
print $fh $intro;

my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");
foreach my $node ( $data->findnodes('//Folder') ) {
    print ($node->toString);
#   my($zip) = $node->findnodes('./ExtendedData/SchemaData/SimpleData');
#   print ($zip->to_literal."\n");
#   if ($zip->to_literal =~ /(^G1Y)|(^G3A)|(^G2G)|(^G3L)|(^G3H)|G0A2R0|G0A1T0|G0A1L0|G0A3H0|G0A3G0|G0A2Y0|G0A2Z0|G0A4N0|G0A2J0|G0A3M0|G0A4A0|G0A1A0|G0A1Y0|G0A1S0|G0A4B0|G0A3T0|G0A3B0|G0A4H0|G0A1W0|G0A3L0|G0A4L0|G0A3A0/){
#       print $fh $node->to_literal;
#   }
}

print $fh $outro; 
close $fh or warn "Impossible de fermer le fichier après écriture";

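Thanks to anyone who can help! PS: this is a trimmed-down .kml file; the real one contains the geographic data for every Canadian postal code. I want to generate another .kml that contains only the postal codes I need, so that I can build a map with the Google Maps API.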

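3 Answers:

Answer 0: (score: 1)

Your problem is that your nodes are all in a namespace, so you need to deal with that. An unprefixed name in an XPath expression matches only elements in no namespace, which is why '//Folder' finds nothing. The simplest way around this is probably to use an XML::LibXML::XPathContext object.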

my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");

my $xpc = XML::LibXML::XPathContext->new($data);
$xpc->registerNs('k', 'http://www.opengis.net/kml/2.2');

foreach my $node ( $xpc->findnodes('//k:Folder') ) {
  ...
}
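
To make the difference concrete, here is a minimal sketch that runs the same query three ways against a small inline document. The inline XML is a stand-in for the real file, and the local-name() form is an alternative not mentioned in the answer; it matches on the element name alone and ignores the namespace, which is convenient but less strict than registering a prefix.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# A tiny stand-in for the KML file, using the same default namespace.
my $kml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>PostalCodeCanada</name></Folder>
</Document></kml>
XML

my $doc = XML::LibXML->load_xml(string => $kml);

# An unprefixed name in XPath 1.0 means "no namespace", so this finds nothing.
my @plain = $doc->findnodes('//Folder');

# With the namespace bound to a prefix, the same element is found.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(k => 'http://www.opengis.net/kml/2.2');
my @prefixed = $xpc->findnodes('//k:Folder');

# local-name() compares only the local part of the name, ignoring the namespace.
my @by_local_name = $doc->findnodes('//*[local-name() = "Folder"]');

printf "unprefixed: %d, prefixed: %d, local-name(): %d\n",
    scalar @plain, scalar @prefixed, scalar @by_local_name;
```

The prefix itself ('k' here) is arbitrary; only the namespace URI has to match the one declared in the document.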

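Answer 1: (score: 1)

Your XML data uses a default namespace, which must be specified explicitly when you access the data with XPath. Where XML::LibXML is concerned, this means you must create an XML::LibXML::XPathContext object to search the data.

Here is an example program that meets your needs: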

#!/usr/bin/env perl

use strict;
use warnings;

use XML::LibXML;

my $doc = XML::LibXML->load_xml(location => 'PostalCodeCanada.kml');
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( gis => 'http://www.opengis.net/kml/2.2');

for my $folder ( $xpc->findnodes('/gis:kml/gis:Document/gis:Folder') ) {

    my ($zip) = $xpc->findnodes('gis:Placemark/gis:ExtendedData/gis:SchemaData/gis:SimpleData', $folder);
    $zip = $zip->to_literal;

    print "$zip\n";

    if ( $zip =~ /(?:G0A(?:1A0|1L0|1S0|1T0|1W0|1Y0|2J0|2R0|2Y0|2Z0|3A0|3B0|3G0|3H0|3L0|3M0|3T0|4A0|4B0|4H0|4L0|4N0)|G1Y|G1Y1B1|G2G|G3A|G3H|G3L)/){
        print $folder->to_literal;
    }
}
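
Note that the program above tests only the first SimpleData element found in each Folder. Since the question asks for a new file containing only the wanted postal codes, a per-Placemark filter may be closer to the goal. The following is a sketch under stated assumptions, not the answer author's code: the inline document is a stand-in for the real file, the second postal code (H2X1Y4) is made up for illustration, and the pattern is a shortened, hypothetical version of the one in the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Inline stand-in for PostalCodeCanada.kml; the second ZIP is a made-up example.
my $kml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>PostalCodeCanada</name>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
      <SimpleData name="ZIP">G1Y1B1</SimpleData>
      <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
  </Placemark>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
      <SimpleData name="ZIP">H2X1Y4</SimpleData>
      <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
  </Placemark>
</Folder>
</Document></kml>
XML

my $doc = XML::LibXML->load_xml(string => $kml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(gis => 'http://www.opengis.net/kml/2.2');

# Hypothetical, shortened pattern: keep codes that start with G1Y or G3A.
my $wanted = qr/^(?:G1Y|G3A)/;

my @kept;
for my $placemark ($xpc->findnodes('//gis:Placemark')) {
    # Select the ZIP field by its name attribute rather than by position.
    my $zip = $xpc->findvalue('.//gis:SimpleData[@name="ZIP"]', $placemark);
    next unless $zip =~ $wanted;
    push @kept, $zip;
    # toString keeps the markup; to_literal would return only the text content.
    print $placemark->toString, "\n";
}
print "kept: @kept\n";
```

In the original script, the print of each matching Placemark would go to $fh between $intro and $outro instead of to standard output.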

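Answer 2: (score: 0)

You already have answers that use XML::LibXML. But I'll point out that if you use XML::Twig, you can ignore namespaces. (That's because it doesn't really support them, which is fine if you only have one!)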

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig -> new -> parsefile ( 'input.kml');
foreach my $node ( $twig -> findnodes ( '//Folder') ) {
   print $node -> text,"\n";
}