Parsing the contents of an XML tag in Ruby

Date: 2016-05-03 17:05:09

Tags: ruby xml parsing nokogiri

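I have an XML document that, as far as I can tell, has already been parsed into tags. My goal is to parse all of the information inside the <GetResidentsContactInfoResult> tag. In the sample XML below, that tag contains two records, each starting with the Lease PropertyId key. How can I iterate over the <GetResidentsContactInfoResult> tag and print the key/value pairs of each record? I am brand new to Ruby and to working with XML files. Is this something I can do with Nokogiri?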

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <soap:Body>
      <GetResidentsContactInfoResponse xmlns="http://tempuri.org/">
         <GetResidentsContactInfoResult>&lt;PropertyResidents&gt;&lt;Lease PropertyId="21M" BldgID="00" UnitID="0903" ResiID="3" occustatuscode="P" occustatuscodedescription="Previous" MoveInDate="2016-01-07T00:00:00" MoveOutDate="2016-02-06T00:00:00" LeaseBeginDate="2016-01-07T00:00:00" LeaseEndDate="2017-01-31T00:00:00" MktgSource="DBY" PrimaryEmail="noemail1@fake.com"&gt;&lt;Occupant PropertyId="21M" BldgID="00" UnitID="0903" ResiID="3" OccuSeqNo="3444755" OccuFirstName="Efren" OccuLastName="Cerda" Phone2No="(832) 693-9448" ResponsibleFlag="Responsible" /&gt;&lt;/Lease&gt;&lt;Lease PropertyId="21M" BldgID="00" UnitID="0908" ResiID="2" occustatuscode="P" occustatuscodedescription="Previous" MoveInDate="2016-02-20T00:00:00" MoveOutDate="2016-04-25T00:00:00" LeaseBeginDate="2016-02-20T00:00:00" LeaseEndDate="2017-02-28T00:00:00" MktgSource="PW" PrimaryEmail="noemail1@fake.com"&gt;&lt;Occupant PropertyId="21M" BldgID="00" UnitID="0908" ResiID="2" OccuSeqNo="3451301" OccuFirstName="Donna" OccuLastName="Mclean" Phone2No="(713) 785-4240" ResponsibleFlag="Responsible" /&gt;&lt;/Lease&gt;&lt;/PropertyResidents&gt;</GetResidentsContactInfoResult>
      </GetResidentsContactInfoResponse>
   </soap:Body>
</soap:Envelope>

1 answer:

Answer 0 (score: 2)

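This uses Nokogiri to find all GetResidentsContactInfoResponse elements, and then Active Support to convert the inner text into a hash of key/value pairs.

I assume you are fine with Nokogiri, since you mentioned it in your question.

If you would rather not use Active Support, consider looking at "Convert a Nokogiri document to a Ruby Hash" as an alternative to the line Hash.from_xml(elm.text):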

require 'nokogiri'
# Needed in order to use `Hash.from_xml`
require 'active_support/core_ext/hash/conversions'

def find_key_values(str)
  doc = Nokogiri::XML(str)

  # Ignore namespaces for easier traversal
  doc.remove_namespaces!
  doc.css('GetResidentsContactInfoResponse').map do |elm|
    Hash.from_xml(elm.text)
  end
end

Usage:

# Option 1: if your XML above is stored in a variable called `string`
find_key_values string

# Option 2: if your XML above is stored in a file
find_key_values File.read('/path/to/file')

Returns:

[{"PropertyResidents"=>
   {"Lease"=>
     [{"PropertyId"=>"21M",
       "BldgID"=>"00",
       "UnitID"=>"0903",
       "ResiID"=>"3",
       "occustatuscode"=>"P",
       "occustatuscodedescription"=>"Previous",
       "MoveInDate"=>"2016-01-07T00:00:00",
       "MoveOutDate"=>"2016-02-06T00:00:00",
       "LeaseBeginDate"=>"2016-01-07T00:00:00",
       "LeaseEndDate"=>"2017-01-31T00:00:00",
       "MktgSource"=>"DBY",
       "PrimaryEmail"=>"noemail1@fake.com",
       "Occupant"=>
        {"PropertyId"=>"21M",
         "BldgID"=>"00",
         "UnitID"=>"0903",
         "ResiID"=>"3",
         "OccuSeqNo"=>"3444755",
         "OccuFirstName"=>"Efren",
         "OccuLastName"=>"Cerda",
         "Phone2No"=>"(832) 693-9448",
         "ResponsibleFlag"=>"Responsible"}},
      {"PropertyId"=>"21M",
       "BldgID"=>"00",
       "UnitID"=>"0908",
       "ResiID"=>"2",
       "occustatuscode"=>"P",
       "occustatuscodedescription"=>"Previous",
       "MoveInDate"=>"2016-02-20T00:00:00",
       "MoveOutDate"=>"2016-04-25T00:00:00",
       "LeaseBeginDate"=>"2016-02-20T00:00:00",
       "LeaseEndDate"=>"2017-02-28T00:00:00",
       "MktgSource"=>"PW",
       "PrimaryEmail"=>"noemail1@fake.com",
       "Occupant"=>
        {"PropertyId"=>"21M",
         "BldgID"=>"00",
         "UnitID"=>"0908",
         "ResiID"=>"2",
         "OccuSeqNo"=>"3451301",
         "OccuFirstName"=>"Donna",
         "OccuLastName"=>"Mclean",
         "Phone2No"=>"(713) 785-4240",
         "ResponsibleFlag"=>"Responsible"}}]}}]
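
The question also asks how to print the key/value pairs of each record. A minimal sketch of that step, working on the structure shown above, might look like the following. The helper names print_leases and print_pairs and the out parameter are hypothetical and are not part of the original answer. One caveat this sketch tries to cover: when the XML contains only a single Lease, Hash.from_xml returns a Hash for "Lease" instead of an Array, so the value is normalized before iterating.

```ruby
# Hypothetical helpers (not from the original answer) that print the
# key/value pairs of the structure returned by find_key_values.

# Prints each pair of `hash`, descending into nested hashes (for example
# the "Occupant" entry) with two extra spaces of indentation per level.
def print_pairs(hash, out, indent)
  pad = ' ' * indent
  hash.each do |key, value|
    nested = value.is_a?(Hash) ? [value] : value
    if nested.is_a?(Array) && nested.all?(Hash)
      nested.each do |child|
        out.puts "#{pad}#{key}:"
        print_pairs(child, out, indent + 2)
      end
    else
      out.puts "#{pad}#{key}: #{value}"
    end
  end
end

# `results` is assumed to be the Array returned by find_key_values.
# `out` is any object responding to `puts`; it defaults to standard output.
def print_leases(results, out = $stdout)
  results.each do |result|
    leases = result.dig('PropertyResidents', 'Lease')
    # A single <Lease> is returned as a Hash rather than an Array of Hashes.
    leases = [leases] if leases.is_a?(Hash)

    Array(leases).each_with_index do |lease, index|
      out.puts "Lease #{index + 1}"
      print_pairs(lease, out, 2)
    end
  end
end
```

Calling print_leases(find_key_values(string)) would then print one "Lease N" heading per record, followed by its attributes and the nested Occupant attributes.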