Easiest way to get the XPath between two nodes

Asked: 2014-02-14 00:21:36

Tags: ruby nokogiri

With Nokogiri, it is easy to get the absolute path from any node back up to the root: just call node.path. Take this document as an example:

<bookstore>
  <department category="COOKING">
    <book>
      <title lang="en">Everyday Italian</title>
      <author>Giada De Laurentiis</author>
      <year>2005</year>
      <price>30.00</price>
    </book>
    <book>
      <title lang="en">Nice meals</title>
      <author>J K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
    </book>
  </department>
  <department category="WEB">
    <book>
      <title lang="en">Learning XML</title>
      <author>Erik T. Ray</author>
      <year>2003</year>
      <price>39.95</price>
    </book>
  </department>
</bookstore>

If I run

tree.search("//title[text() = 'Learning XML']").first.path

I get something like bookstore/department[2]/book[1]/title[1]

Now, what if I want the path to this node, not from the root, but starting from, say, //department[@category='WEB'] and going down to the same title node?

In other words: how can I generically get or generate the path between two known nodes, such as //department[@category='WEB'] and bookstore/department[2]/book[1]/title[1]?

Edit

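I have been thinking about ways to turn //department[@category='WEB'] into a new "root", for example by deleting some content and then calling .path on the title node again. That doesn't seem very "simple"...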

1 answer:

Answer 0 (score: 2)

I'm not wild about the string-based hack at the end, but this now produces clean XPath:

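require 'nokogiri'
class Nokogiri::XML::Node
  # Returns a relative XPath leading from this node to +node+.
  def path_to( node )
    # This node followed by its ancestors, nearest first.
    self_ancestors = [self].concat(self.ancestors)
    # Nearest node that is an ancestor-or-self of both endpoints.
    shared = (self_ancestors & [node].concat(node.ancestors)).first
    # One "../" per level up to the shared node, then the tail of node.path below it.
    [ "../"*self_ancestors.index(shared),
      ".", node.path[shared.path.length..-1] ]
      .join
      .sub( %r{\A\./|/\.(?=/|\z)}, '' ) # remove superfluous "."
  end
end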

doc = Nokogiri.XML(IO.read('tmp.rxml'))
n1  = doc.at("//department[@category='WEB']")
n2  = doc.at("//title[.='Learning XML']")
n3  = doc.at("//year[.='2003']")
n4  = doc.at("//@lang")

p n1.path_to(n2) #=> "book/title"
p n2.path_to(n3) #=> "../year"
p n3.path_to(n2) #=> "../title"
p n2.path_to(n1) #=> "../.."
p n1.path_to(n1) #=> "."
p n4.path_to(n2) #=> "../../../../department[2]/book/title"
p n2.path_to(n4) #=> "../../../department[1]/book[1]/title/@lang"

p n2.at( n2.path_to(n4) )==n4 #=> true
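
If you would rather not reopen Nokogiri::XML::Node, the same idea can be sketched on the two absolute path strings alone. This is only a rough sketch, not part of Nokogiri: relative_xpath is a hypothetical helper name, and it assumes both strings came from node.path on the same, unmodified document, so every step is a plain name with at most a positional predicate and never contains a "/".

```ruby
# Hypothetical helper (not a Nokogiri API): build a relative XPath from two
# absolute path strings, such as the ones returned by node.path.
# Assumption: no step contains "/" (true for positional paths like "book[1]").
def relative_xpath(from_path, to_path)
  from_steps = from_path.split('/').reject(&:empty?)
  to_steps   = to_path.split('/').reject(&:empty?)

  # Number of leading steps the two paths have in common.
  shared = from_steps.zip(to_steps).take_while { |a, b| a == b }.length

  # Climb from the start node to the shared ancestor, then descend to the target.
  steps = ['..'] * (from_steps.length - shared) + to_steps.drop(shared)
  steps.empty? ? '.' : steps.join('/')
end

dept  = '/bookstore/department[2]'
title = '/bookstore/department[2]/book/title'
year  = '/bookstore/department[2]/book/year'

p relative_xpath(dept, title)  #=> "book/title"
p relative_xpath(title, year)  #=> "../year"
p relative_xpath(title, dept)  #=> "../.."
p relative_xpath(dept, dept)   #=> "."
```

Because it compares strings only, this version never touches the document; in exchange, it silently gives wrong answers if the two paths were taken before and after the tree was modified.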