Basically, I want to extract the absolute path from a node up to the root and report it to the console or a file. Here is my current solution:
require "rexml/document"
include REXML
def get_path(xml_doc, key)
  XPath.each(xml_doc, key) do |node|
    puts "\"#{node}\""
    XPath.each(node, '(ancestor::#node)') do |el|
      # puts el
    end
  end
end
test_doc = Document.new <<EOF
<root>
  <level1 key="1" value="B">
    <level2 key="12" value="B" />
    <level2 key="13" value="B" />
  </level1>
</root>
EOF
get_path test_doc, "//*/[@key='12']"
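The problem is that it gives me "<level2 value='B' key='12'/>" as output. The desired output is <root><level1><level2 value='B' key='12'/> (the format may differ; the main goal is to have the full path). I only have basic knowledge of XPath and would appreciate any help or guidance on how to achieve this.
Answer 0 (score: 4)
This should get you started: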
require 'nokogiri'
test_doc = Nokogiri::XML <<EOF
<root>
  <level1 key="1" value="B">
    <level2 key="12" value="B" />
    <level2 key="13" value="B" />
  </level1>
</root>
EOF
node = test_doc.at('//level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>
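Nokogiri is very nice because it lets you use CSS accessors instead of XPath, if you prefer. CSS is more intuitive for some people and can be cleaner than the equivalent XPath: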
node = test_doc.at('level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>
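Answer 1 (score: 3)
First, note that your document is not what I think you want. I suspect you did not want <level1> to be self-closing, but rather to contain the <level2> elements as children.
Second, I prefer and advocate Nokogiri over REXML. It is nice that REXML ships with Ruby, but Nokogiri is faster and more convenient, IMHO. So: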
require 'nokogiri'
test_doc = Nokogiri::XML <<EOF
<root>
  <level1 key="1" value="B">
    <level2 key="12" value="B" />
    <level2 key="13" value="B" />
  </level1>
</root>
EOF
def get_path(xml_doc, key)
  xml_doc.at_xpath(key).ancestors.reverse
end
path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |node| node.name }.join( '/' )
#=> "document/root/level1"
Answer 2 (score: 2)
If you are set on REXML, here is a REXML solution:
require 'rexml/document'
test_doc = REXML::Document.new <<EOF
<root>
  <level1 key="1" value="B">
    <level2 key="12" value="B" />
    <level2 key="13" value="B" />
  </level1>
</root>
EOF
def get_path(xml_doc, key)
  node = REXML::XPath.first( xml_doc, key )
  path = []
  while node.parent
    path << node
    node = node.parent
  end
  path.reverse
end
path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |el| el.name }.join("/")
#=> "root/level1/level2"
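To get closer to the tag-style output the question asks for, the same parent walk can be rendered as opening tags instead of names. This is only a rough sketch: `opening_tag` and `path_as_tags` are hypothetical helper names, not part of REXML, and returning nil when nothing matches is an assumption about the desired behavior.

```ruby
require 'rexml/document'

# Hypothetical helper: renders an element's opening tag, with its attributes.
def opening_tag(el)
  attrs = []
  el.attributes.each_attribute { |a| attrs << " #{a.name}='#{a.value}'" }
  "<#{el.name}#{attrs.join}>"
end

# Hypothetical helper: bare tags for the ancestors, full tag for the match.
def path_as_tags(xml_doc, key)
  node = REXML::XPath.first(xml_doc, key)
  return nil unless node            # assumption: nil when nothing matches
  tags = [opening_tag(node)]
  node = node.parent
  while node.parent                 # stop before the owning document
    tags.unshift("<#{node.name}>")
    node = node.parent
  end
  tags.join
end

doc = REXML::Document.new(
  '<root><level1 key="1" value="B"><level2 key="12" value="B"/></level1></root>'
)
tag_path = path_as_tags(doc, "//*[@key='12']")
puts tag_path
```

The ancestors are printed without attributes to mirror the format in the question; calling `opening_tag` on them as well would include their attributes.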
Alternatively, if you would rather use the same get_path implementation from the other answer, you can monkeypatch REXML to add an ancestors method:
class REXML::Child
  def ancestors
    ancestors = []
    # Presumably you don't want the node included in its list of ancestors
    # If you do, change the following line to node = self
    node = self.parent
    # Presumably you want to stop at the root node, and not its owning document
    # If you want the document included in the ancestors, change the following
    # line to just while node
    while node.parent
      ancestors << node
      node = node.parent
    end
    ancestors.reverse
  end
end
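A short usage sketch of such a patch follows. The patch is repeated so the snippet runs on its own, and the extra `node &&` guard is an addition of this sketch (not in the patch above), so that calling `ancestors` on a node without a parent returns an empty list instead of raising.

```ruby
require 'rexml/document'

class REXML::Child
  # Ancestors from the root element down to the direct parent,
  # excluding both the node itself and the owning document.
  def ancestors
    list = []
    node = parent
    while node && node.parent       # guard added here for parentless nodes
      list << node
      node = node.parent
    end
    list.reverse
  end
end

doc = REXML::Document.new(
  '<root><level1 key="1" value="B"><level2 key="12" value="B"/></level1></root>'
)
node = REXML::XPath.first(doc, "//*[@key='12']")

# Append the node's own name to get the full path down to the match.
path = (node.ancestors.map { |el| el.name } + [node.name]).join('/')
puts path
#=> "root/level1/level2"
```

Reopening a library class affects every caller in the process, so this is probably best kept to small scripts.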