How do I extract only the desired elements from XML in Perl?

Date: 2014-05-19 13:00:22

Tags: xml perl

I want a Perl routine that does the following. Sample XML file:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/> First Country
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/> Second Country
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/> Third Country
    </country>
</data>

When I give the element name rank as input, the output should look like this:

<?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
        </country>
        <country name="Singapore">
            <rank>4</rank>
        </country>
        <country name="Panama">
            <rank>68</rank>
        </country>
    </data>

3 answers:

Answer 0 (score: 3)

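This solution uses XML::LibXML. For each country element it finds the child with the requested node name, removes all of the element's children, and then appends the selected child back.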

use strict;
use warnings;

use XML::LibXML;

my $doc = XML::LibXML->load_xml(IO => *DATA, no_blanks => 1);

my $nodename = 'rank';

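for my $country ($doc->findnodes('/data/country')) {
  # Grab the wanted child first, then empty the element and put the child back
  my ($node) = $country->findnodes($nodename);
  $country->removeChildNodes;
  $country->appendChild($node) if $node;  # skip countries that lack the element
}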

print $doc->toString(1);

__DATA__
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/> First Country
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/> Second Country
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/> Third Country
    </country>
</data>

Output

<?xml version="1.0"?>
<data>
  <country name="Liechtenstein">
    <rank>1</rank>
  </country>
  <country name="Singapore">
    <rank>4</rank>
  </country>
  <country name="Panama">
    <rank>68</rank>
  </country>
</data>
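
Since the question asks for a routine that takes the element name as input, the same approach can be wrapped in a subroutine. The following is only a minimal sketch: the name keep_only is hypothetical, the element name is assumed to be a plain tag name that is safe to use as a relative XPath expression, and every matching child is kept rather than just the first one.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of the answer above): keep only the children
# of each <country> whose element name is $nodename.
sub keep_only {
    my ($xml, $nodename) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml, no_blanks => 1);

    for my $country ($doc->findnodes('/data/country')) {
        # Collect the matches before emptying the element
        my @keep = $country->findnodes($nodename);
        $country->removeChildNodes;
        $country->appendChild($_) for @keep;
    }
    return $doc->toString(1);
}

my $xml = <<'XML';
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/> First Country
    </country>
    <country name="Singapore">
        <year>2011</year>
    </country>
</data>
XML

my $result = keep_only($xml, 'neighbor');
print $result;
```

A country without a matching child (Singapore in this shortened sample) simply ends up as an empty element instead of causing an error.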

Answer 1 (score: 2)

Using XML::Twig:

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

my $tag= shift @ARGV;
my $xml_file= shift @ARGV;

XML::Twig->new( twig_handlers => 
                  { 'country' => sub { foreach my $c ($_->children) 
                                         { $c->delete unless $c->is( $tag); } 
                                     },
                  },
                 pretty_print => 'indented',
              )->parsefile( $xml_file )
               ->print;
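
The script above reads the tag name and the file name from the command line and prints the result. If the XML is already in memory and the result is needed as a string, a variant along these lines might work. It is a sketch under assumptions: the shortened sample document is made up for illustration, and it compares tag names with the tag method instead of calling is.

```perl
use strict;
use warnings;
use XML::Twig;

my $tag = 'rank';   # element name to keep

# Shortened sample input, for illustration only
my $xml = <<'XML';
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/> First Country
    </country>
</data>
XML

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once each <country> element has been fully parsed
        country => sub {
            my ($t, $country) = @_;
            for my $child ($country->children) {
                # Text children report the tag '#PCDATA', so they are deleted too
                $child->delete unless $child->tag eq $tag;
            }
        },
    },
    pretty_print => 'indented',
);

$twig->parse($xml);            # parse a string instead of a file
my $result = $twig->sprint;    # serialize to a string instead of printing
print $result;
```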

Answer 2 (score: 1)

You can do this with Perl + XSLT. First you need an XSLT stylesheet. The one below performs the transformation you need (you can test it here):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"></xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="year|gdppc|neighbor|country/text()" />
</xsl:stylesheet>

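There are many XSLT libraries for Perl (search CPAN, or look at XML::LibXSLT, which is the most popular one). For two options, see the first answer in this question. The second option, XML::LibXSLT::Easy, is very simple and is probably all you need: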

use XML::LibXSLT::Easy;
my $p = XML::LibXSLT::Easy->new;
my $output = $p->process( xml => "data.xml", xsl => "stylesheet.xsl" );

The result of this transformation is:

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <country name="Liechtenstein">
      <rank>1</rank>
   </country>
    <country name="Singapore">
      <rank>4</rank>
   </country>
    <country name="Panama">
      <rank>68</rank>
   </country>
</data>
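
The stylesheet above hard-codes the elements to drop. To pass the element name in as input, as the question asks, one option is a stylesheet parameter used with XML::LibXSLT directly. This is a sketch rather than a tested drop-in: the parameter name keep and the shortened sample document are assumptions made for illustration. The test on the parameter sits in a select expression because XSLT 1.0 does not allow variable references in match patterns.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical stylesheet: "keep" is a parameter name chosen for this sketch
my $xsl = <<'XSL';
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>
    <xsl:param name="keep"/>
    <!-- Identity template: copy everything not handled below -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- For each country, copy its attributes and only the requested children -->
    <xsl:template match="country">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:copy-of select="*[name() = $keep]"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
XSL

# Shortened sample input, for illustration only
my $xml = <<'XML';
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/> First Country
    </country>
</data>
XML

my $source     = XML::LibXML->load_xml(string => $xml);
my $style_doc  = XML::LibXML->load_xml(string => $xsl);
my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);

# xpath_to_string quotes the value so it is passed as an XPath string
my $results = $stylesheet->transform(
    $source,
    XML::LibXSLT::xpath_to_string(keep => 'rank'),
);
my $output = $stylesheet->output_as_bytes($results);
print $output;
```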