Question

I'm trying to work out whether there is a good way to determine the XPath of an XML node.

At the moment, I'm doing this:

#!/usr/bin/env perl

use strict;
use warnings;

use XML::Twig;

my $twig = XML::Twig->parse( \*DATA );
print $twig->get_xpath( '/root/fish/carrot[@colour="orange"]/pie', 0 )->text,
    "\n";

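foreach my $node ( $twig->get_xpath('//*') ) {
    my @path_tags;        # tag names only
    my @path_with_att;    # tag names plus attribute predicates
    my $cursor = $node;

    # Walk from the node up to the root, prepending one step per ancestor.
    while ($cursor) {
        unshift( @path_tags, $cursor->tag );

        my $att_path = "";
        if ( $cursor->atts ) {
            # One [@name="value"] predicate per attribute; the names are
            # sorted so the generated path is the same on every run.
            $att_path = join( "",
                map { "[@" . $_ . "=\"" . $cursor->att($_) . "\"]" }
                    sort keys %{ $cursor->atts } );
        }
        unshift( @path_with_att, $cursor->tag . $att_path );
        $cursor = $cursor->parent;
    }

    print join( "/", @path_tags ), "\n";

    my $xpath_with_atts = "/" . join( "/", @path_with_att );
    print $xpath_with_atts, "\n";

    # Verification only: look the generated path up again.
    print "Found:", $twig->get_xpath( $xpath_with_atts, 0 )->tag, "\n";
}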

__DATA__
<root>
   <fish skin="scaly" home="pond">
      <carrot colour="orange">
          <pie>This value</pie>
      </carrot>
   </fish>
</root>

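I'm walking the structure (using a wildcard XPath, which is perhaps slightly ironic, but the point is that I'd like to be able to do this inside, for example, a twig handler).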

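Then I walk up the tree from each node, parent by parent, to build two variants of the current node's XPath (with and without attributes). This of course recognises that such an XPath is by no means unique, so there may be duplicates (the final 'Found:' line is purely a verification step).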

But I'm doing it this way because I couldn't find a what-is-my-xpath method in either of the two XML libraries I tend to favour (XML::LibXML and XML::Twig).

So my question is: can I (and should I) use a built-in mechanism from an XML library? And if there isn't one, is there actually a reason why not?

I mean, my example above works, but I'm wondering whether some nuance of the XML spec as a whole makes this approach (or something like it) unworkable.

Answer 1

XML::LibXML has $node->nodePath().

use strict;
use warnings;
use feature qw( say );

use XML::LibXML qw( );

my $xml = <<'__EOI__';
<root>
<fish>
<carrot colour="purple"><pie/></carrot>
<carrot colour="orange"><pie/></carrot>
<carrot colour="blue"><pie/></carrot>
</fish>
</root>
__EOI__

my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xml);
say $_->nodePath()
   for $doc->findnodes("//pie");

Output:

/root/fish/carrot[1]/pie
/root/fish/carrot[2]/pie
/root/fish/carrot[3]/pie

It uses positions rather than attributes to disambiguate, because attributes may not uniquely identify an element.
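To illustrate why attributes may not be enough, here is a minimal sketch. The document with two identically coloured carrots is made up for this example; it is not from the question. The attribute-based path matches both pies, while a positional path picks out exactly one.

```perl
use strict;
use warnings;
use feature qw( say );

use XML::LibXML qw( );

# Hypothetical input: two <carrot> elements sharing the same attribute value.
my $doc = XML::LibXML->load_xml( string => <<'__EOI__' );
<root>
<fish>
<carrot colour="orange"><pie/></carrot>
<carrot colour="orange"><pie/></carrot>
</fish>
</root>
__EOI__

# The attribute predicate cannot tell the two carrots apart.
my @by_attribute = $doc->findnodes('/root/fish/carrot[@colour="orange"]/pie');

# The positional predicate selects only the second carrot's pie.
my @by_position = $doc->findnodes('/root/fish/carrot[2]/pie');

say "attribute predicate matched: ", scalar @by_attribute;
say "positional predicate matched: ", scalar @by_position;
```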

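Note that using the path in another document may yield more than one result, because nodePath() omits the [1] for a node that has no same-named siblings in the original document.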
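A small sketch of that caveat, using two made-up documents (the variable names are only for illustration): the path is taken from a document with a single carrot and then evaluated against one with two.

```perl
use strict;
use warnings;
use feature qw( say );

use XML::LibXML qw( );

# Hypothetical source document: only one <carrot>, so no position is needed.
my $doc_one = XML::LibXML->load_xml(
    string => '<root><fish><carrot><pie/></carrot></fish></root>' );

# Hypothetical second document: same shape, but two <carrot> elements.
my $doc_two = XML::LibXML->load_xml(
    string => '<root><fish><carrot><pie/></carrot><carrot><pie/></carrot></fish></root>' );

my ($pie) = $doc_one->findnodes('//pie');
my $path  = $pie->nodePath();

# Re-use the path from the first document against the second one.
my @matches = $doc_two->findnodes($path);

say "path from first document: $path";
say "matches in second document: ", scalar @matches;
```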

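As for why XML::Twig doesn't have one: it's not a very useful feature. If you're only working with one document, you already have a reference to the node. If you want to build a path that works across multiple documents, there is no way for the module to know what the correct path should be. For example, which of the following is correct?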

/a/b
/a/b[1]
/a/b[@id="123"]
/a/b[@default="1"]
/a/b[@id="123" and @default="1"]
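One caveat to the XML::Twig remark: the XML::Twig documentation does list two element methods, path and xpath, which return the tag-only form and the positional form respectively, so it may be worth checking the installed version before hand-rolling this. A minimal sketch, reusing the sample document from the nodePath() example:

```perl
use strict;
use warnings;
use feature qw( say );

use XML::Twig;

my $xml = <<'__EOI__';
<root>
<fish>
<carrot colour="purple"><pie/></carrot>
<carrot colour="orange"><pie/></carrot>
<carrot colour="blue"><pie/></carrot>
</fish>
</root>
__EOI__

my $twig = XML::Twig->new();
$twig->parse($xml);

# Query after parsing has finished, so every sibling is already in the tree.
my @pies = $twig->get_xpath('//pie');

my @paths  = map { $_->path }  @pies;    # tag names only
my @xpaths = map { $_->xpath } @pies;    # with positions where needed

say for @paths;
say for @xpaths;
```

Inside a twig handler the later siblings have not been parsed yet, so the positional form computed there may differ from the one computed after parsing; that is worth verifying before relying on it.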
