Parsing child nodes with Nokogiri

Time: 2015-10-27 09:11:41

Tags: ruby nokogiri

I'm trying to parse this XML structure with Nokogiri's XPath.

<root>
  <resource id='1' name='name1'>
    <prices>
      <price datefrom='2015-01-01' dateto='2015-05-31' price='3000' currency='EUR'></price>
      <price datefrom='2015-06-01' dateto='2015-12-31' price='4000' currency='EUR'></price>
    </prices>
  </resource>
  <!-- many more resource nodes -->
</root>

I'm iterating over each resource, and for each one I need to get its <price> elements:

resourcesParsed = Nokogiri::XML(resourcesXML)
resources = resourcesParsed.xpath("//resource")
for resource in resources do
  id = resource["id"]
  # insert in resources tables
  # parsing resource prices
  getPrices(resource)
end
...

def getPrices(resource)
  prices = resource.xpath("//price") 
  @logger.debug "prices=" + prices.to_s
  # do whatever
end 

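For some reason, when I query //price, it doesn't return only the <price> nodes inside the current resource. It returns every <price> node in the whole XML document.

How can I get only the <price> nodes of the resource?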

2 answers:

Answer 0 (score: 1)

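I figured it out.

Instead of:

prices = resource.xpath("//price")

I should search with:

prices = resource.xpath(".//price")

The leading dot anchors the expression at the current node. Without it, // starts from the document root no matter which node xpath is called on.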

Answer 1 (score: 1)

I'd write the code like:

resources = doc.search('resource').map{ |resource|
  [
    resource['id'],
    resource.search('price').map{ |price|
      {
        price: price['price'],
        datefrom: price['datefrom'],
        dateto: price['dateto'],
        currency: price['currency']
      }
    }
  ]
}

At this point resources is an array of arrays, where each sub-array is a resource id followed by an array of hashes holding that resource's prices:

# => [["1",
#      [{:price=>"3000",
#        :datefrom=>"2015-01-01",
#        :dateto=>"2015-05-31",
#        :currency=>"EUR"},
#       {:price=>"4000",
#        :datefrom=>"2015-06-01",
#        :dateto=>"2015-12-31",
#        :currency=>"EUR"}]]]

It'd be a little easier to reuse that for lookups or further processing if it's a hash keyed by resource id, where each value is the array of that resource's prices:

resources.to_h
# => {"1"=>
#      [{:price=>"3000",
#        :datefrom=>"2015-01-01",
#        :dateto=>"2015-05-31",
#        :currency=>"EUR"},
#       {:price=>"4000",
#        :datefrom=>"2015-06-01",
#        :dateto=>"2015-12-31",
#        :currency=>"EUR"}]}
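As one possible illustration of such a lookup, the sketch below finds the price that applies on a given date. The hash literal is copied from the output above rather than parsed, and price_on is a hypothetical helper name, not something from the answer or from Nokogiri.

```ruby
require 'date'

# Same shape as the result of resources.to_h shown above.
prices_by_id = {
  '1' => [
    { price: '3000', datefrom: '2015-01-01', dateto: '2015-05-31', currency: 'EUR' },
    { price: '4000', datefrom: '2015-06-01', dateto: '2015-12-31', currency: 'EUR' }
  ]
}

# Hypothetical helper: returns the price hash whose date range covers `date`,
# or nil when the id is unknown or no range matches.
def price_on(prices_by_id, id, date)
  prices_by_id.fetch(id, []).find do |p|
    (Date.parse(p[:datefrom])..Date.parse(p[:dateto])).cover?(date)
  end
end

summer = price_on(prices_by_id, '1', Date.new(2015, 7, 1))
puts "#{summer[:price]} #{summer[:currency]}"
# prints "4000 EUR"
```

Attribute values come out of Nokogiri as strings, which is why the dates are parsed before comparing. Converting them once while building the hash would be a reasonable alternative.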