Nokogiri: pairing each nomination @id with its race @id (@id on two node levels)

Time: 2016-06-24 11:23:56

Tags: ruby xml nokogiri

I have an XML file that follows this structure:

<meeting id="42977">
<race id="215411">
  <nomination number="8" saddlecloth="8" horse="Chipanda" id="198926" />
  <nomination number="2" saddlecloth="2" horse="Chifries" id="198965" />
  <nomination number="1" saddlecloth="1" horse="Itpanda" id="199260" />
</race>
<race id="215412">
  <nomination number="1" saddlecloth="1" horse="Ruby" id="199634" />
  <nomination number="2" saddlecloth="2" horse="Gems" id="208926" />
  <nomination number="3" saddlecloth="3" horse="Rock" id="122923" />
</race>
</meeting>

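I want to extract each nomination's id attribute and associate it with the id of the race it comes from. The printout below shows the idea; in practice I will be putting the pairs into a database.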

RaceID    NomID
215411    198926
215411    198965
215411    199260
215412    199634
215412    208926
215412    122923

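I have tried a few different routes, but I can't get CSS selectors (for example @doc.css('race id')) to actually collect all the ids in the file that are descendants of a race.

This is where I have got to so far: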

require 'nokogiri'

@doc = Nokogiri::XML(File.open("data/20160521RHIL0.xml"))
#puts @doc.xpath("//race/nomination/@horse")
race = @doc.xpath("//race")
#nom_id = @doc.xpath("//race/nomination/@id")

race.each do |f|
  f.xpath('//@id | //nomination/@id').each do |node|
    puts node['V']
  end
end
#node_num = race_id.length
#(1..node_num).each do |x|
  #nom_id.each do |y|
    #puts "race ID\t" + "#{race_id[x]}  " + "nom_id\t" + "#{y}"
#end
#end

3 answers:

Answer 0: (score: 1)

I don't know of a clever selector for this, but you can do something like the following:

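# `xml` is assumed to be the parsed document, e.g. Nokogiri::XML(File.open(...))
races = xml.xpath('//race')
races.map do |race|
  # Relative paths (starting with ./) stay inside the current race
  race_id = race.xpath('./@id').text.to_i
  nomination_ids = race.xpath('./nomination/@id').map { |id| id.text.to_i }
  nomination_ids.map do |nomination_id|
    { race_id: race_id, nomination_id: nomination_id }
  end
end.flatten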

This returns an array of hashes, like:

[
  {:race_id => 215411, :nomination_id => 198926},
  {:race_id => 215411, :nomination_id => 198965},
  {:race_id => 215411, :nomination_id => 199260},
  {:race_id => 215412, :nomination_id => 199634},
  {:race_id => 215412, :nomination_id => 208926},
  {:race_id => 215412, :nomination_id => 122923}
]
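
To get from that array to the two-column printout shown in the question, something along these lines might do. This is only a sketch: the `rows` literal is copied from the sample result above, and the column width of 10 is an assumption chosen to match the spacing in the question.

```ruby
# Rows in the shape returned by the code above (copied from its sample output).
rows = [
  { race_id: 215411, nomination_id: 198926 },
  { race_id: 215411, nomination_id: 198965 },
  { race_id: 215411, nomination_id: 199260 },
  { race_id: 215412, nomination_id: 199634 },
  { race_id: 215412, nomination_id: 208926 },
  { race_id: 215412, nomination_id: 122923 }
]

# Header first, then one line per pair with the race id left-aligned
# in a 10-character column.
lines = ['RaceID    NomID']
rows.each do |row|
  lines << format('%-10d%d', row[:race_id], row[:nomination_id])
end

puts lines
```

The same `rows` array could be handed to a database layer one hash at a time instead of being printed.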

Answer 1: (score: 1)

I like to export into something like a hash, so you could use something like this.

xml = '<meeting id="42977">
<race id="215411">
<nomination number="8" saddlecloth="8" horse="Chipanda" id="198926" />
<nomination number="2" saddlecloth="2" horse="Chifries" id="198965" />
<nomination number="1" saddlecloth="1" horse="Itpanda" id="199260" />
</race>
<race id="215412">
<nomination number="1" saddlecloth="1" horse="Ruby" id="199634" />
<nomination number="2" saddlecloth="2" horse="Gems" id="208926" />
<nomination number="3" saddlecloth="3" horse="Rock" id="122923" />
</race>
</meeting>'

require 'nokogiri'

doc = Nokogiri::XML(xml)

races = doc.xpath("//race")

export = races.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |elem, exp|
  elem.xpath("./nomination").each do |nom_elem|
    exp[elem['id']] << nom_elem['id']
  end
end

Output

p export

# {
#   "215411"=>["198926", "198965", "199260"],
#   "215412"=>["199634", "208926", "122923"]
# }
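
If flat race/nomination pairs are needed (for printing or for inserting rows into a database), the hash can be flattened afterwards. A minimal sketch, assuming `export` has the shape shown above; the `pairs` name is only illustrative:

```ruby
# Hash in the shape produced above: race id => list of nomination ids.
export = {
  '215411' => %w[198926 198965 199260],
  '215412' => %w[199634 208926 122923]
}

# Expand each race into one [race_id, nomination_id] pair per nomination.
pairs = export.flat_map do |race_id, nomination_ids|
  nomination_ids.map { |nomination_id| [race_id, nomination_id] }
end

pairs.each { |race_id, nomination_id| puts "#{race_id}\t#{nomination_id}" }
```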

I hope this helps.

Answer 2: (score: 1)

Here is another solution I tried that uses only one loop.

xml = '<meeting id="42977">
<race id="215411">
<nomination number="8" saddlecloth="8" horse="Chipanda" id="198926" />
<nomination number="2" saddlecloth="2" horse="Chifries" id="198965" />
<nomination number="1" saddlecloth="1" horse="Itpanda" id="199260" />
</race>
<race id="215412">
<nomination number="1" saddlecloth="1" horse="Ruby" id="199634" />
<nomination number="2" saddlecloth="2" horse="Gems" id="208926" />
<nomination number="3" saddlecloth="3" horse="Rock" id="122923" />
</race>
</meeting>'

require 'nokogiri'

doc = Nokogiri::XML(xml)

nominations = doc.xpath("//nomination")

export = nominations.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |nom_elem, exp|
  # Each nomination's parent element is the race it belongs to
  exp[nom_elem.parent['id']] << nom_elem['id']
end

Output

p export

# {
#   "215411"=>["198926", "198965", "199260"],
#   "215412"=>["199634", "208926", "122923"]
# }