With Nokogiri, you can easily get the absolute path from any node back up to the root just by calling node.path.
For example, given:
<bookstore>
  <department category="COOKING">
    <book>
      <title lang="en">Everyday Italian</title>
      <author>Giada De Laurentiis</author>
      <year>2005</year>
      <price>30.00</price>
    </book>
    <book>
      <title lang="en">Nice meals</title>
      <author>J K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
    </book>
  </department>
  <department category="WEB">
    <book>
      <title lang="en">Learning XML</title>
      <author>Erik T. Ray</author>
      <year>2003</year>
      <price>39.95</price>
    </book>
  </department>
</bookstore>
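If I do
tree.search("//title[text() = 'Learning XML']").first.path
I get something like bookstore/department[2]/book[1]/title[1].
Now, what if I want the path to this node not from the root, but starting from, say, //department[@category='WEB'] and going down to that same title node?
In other words: given two known nodes, //department[@category='WEB'] and bookstore/department[2]/book[1]/title[1], how do I get or generate the path between them in general?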
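Edit
I've been thinking about turning //department[@category='WEB'] into a kind of new "root", for example by removing some content and then calling .path on the title node again. That doesn't seem very "simple", though...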
Answer 0 (score: 2)
I'm not crazy about the string-based hack at the end, but this now produces clean XPath:
require 'nokogiri'

class Nokogiri::XML::Node
  # Returns a relative XPath leading from this node to +node+.
  def path_to( node )
    # This node followed by its ancestors, nearest first
    self_ancestors = [self].concat(self.ancestors)
    # Nearest node that appears in both ancestor chains
    shared = (self_ancestors & [node].concat(node.ancestors)).first
    # One "../" per step up to the shared node, then the rest of node's path below it
    [ "../"*self_ancestors.index(shared),
      ".", node.path[shared.path.length..-1] ]
      .join
      .sub( %r{\A\./|/\.(?=/|\z)}, '' ) # remove superfluous "."
  end
end

doc = Nokogiri.XML(IO.read('tmp.rxml'))
n1 = doc.at("//department[@category='WEB']")
n2 = doc.at("//title[.='Learning XML']")
n3 = doc.at("//year[.='2003']")
n4 = doc.at("//@lang")

p n1.path_to(n2) #=> "book/title"
p n2.path_to(n3) #=> "../year"
p n3.path_to(n2) #=> "../title"
p n2.path_to(n1) #=> "../.."
p n1.path_to(n1) #=> "."
p n4.path_to(n2) #=> "../../../../department[2]/book/title"
p n2.path_to(n4) #=> "../../../department[1]/book[1]/title/@lang"
p n2.at( n2.path_to(n4) )==n4 #=> true
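
The same idea (step up to the nearest shared ancestor, then step down to the target) can also be sketched on the path strings alone, with no Nokogiri calls. This is only a rough illustration, not part of the code above: relative_xpath is a hypothetical helper name, and it assumes both arguments are absolute paths produced by node.path on the same document, so that segments compare equal as plain strings and no predicate contains a "/".

```ruby
# Hypothetical helper: builds a relative XPath from two absolute path strings.
# Assumes both paths come from node.path on the same document.
def relative_xpath(from_path, to_path)
  from = from_path.split('/').reject(&:empty?)
  to   = to_path.split('/').reject(&:empty?)

  # Count the leading segments the two paths have in common
  shared = 0
  shared += 1 while shared < from.length && shared < to.length &&
                    from[shared] == to[shared]

  ups   = ['..'] * (from.length - shared) # steps up to the shared ancestor
  downs = to[shared..-1]                  # steps down to the target
  steps = ups + downs
  steps.empty? ? '.' : steps.join('/')
end

web   = '/bookstore/department[2]'
title = '/bookstore/department[2]/book/title'
year  = '/bookstore/department[2]/book/year'

p relative_xpath(web, title)  #=> "book/title"
p relative_xpath(title, year) #=> "../year"
p relative_xpath(title, web)  #=> "../.."
p relative_xpath(web, web)    #=> "."
```

Comparing segments as strings is cruder than intersecting the ancestor lists, so treat it as a fallback for when only the path strings are available.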