Question

I have this XML:

<record>
    <f id="27">John Smith</f>
    <f id="28"/>
</record>

I parse it with Nokogiri like this:

# I get the record from the whole document
... 
fields = record.xpath("f")
for field in fields
    puts field.content
end

This returns:

John Smith
\n 28 \n

This is wrong. The second f tag has nothing inside it, so it should return an empty value, right? Any help?

By the way, the same thing happens with LibXML.

Edit

The actual code:

xml = Nokogiri::XML('<?xml version="1.0" ?><records><record><f id="27">John Smith</f><f id="38"/></record></records>')

records = xml.xpath("//record")
records.map{|record|
    fields = record.xpath("f")
    fields.to_enum(:each_with_index).collect{|field,index|
        [field.content, index]
    }
}

Answer 1

I'll take a stab at this question. The tag may contain other tags that you might have missed.

Answer 2

Your XPath accessor is wrong:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<record>
    <f id="27">John Smith</f>
    <f id="28"/>
</record>
EOT

puts doc.xpath('f').size # => 0
puts doc.xpath('//f').size # => 2

puts doc.xpath('//f[@id="27"]').size # => 1
puts doc.xpath('//f[@id="27"]').first.text # => "John Smith"
puts doc.at('//f').text # => "John Smith"

Nokogiri always returns a NodeSet from the xpath, css, and search methods, and a single Node from at and its aliases. Think of a NodeSet as an array.

doc.xpath('//f[@id="27"]').class # => Nokogiri::XML::NodeSet < Object
doc.at('//f[@id="27"]').class # => Nokogiri::XML::Element < Nokogiri::XML::Node

Nokogiri - tag.contents returns wrong data
