I am fetching data from an API that returns XML like this:
<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>
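I'm new to deserialization, but I think the right approach is to parse this XML into a Ruby object, so that I can reference something like objectFoo.seriess.series.frequency and get back 'Quarterly'.
From my searches here and on Google, there doesn't seem to be an obvious solution for this in plain Ruby (NOT Rails), which makes me think I'm missing something fairly obvious. Any ideas?
Edit: I set up a test case based on Winfield's suggestion.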
class Exopenstruct
require 'ostruct'
def initialize()
hash = {"seriess"=>{"realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "series"=>{"id"=>"GDPC1", "realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "title"=>"Real Gross Domestic Product, 1 Decimal", "observation_start"=>"1947-01-01", "observation_end"=>"2012-10-01", "frequency"=>"Quarterly", "frequency_short"=>"Q", "units"=>"Billions of Chained 2005 Dollars", "units_short"=>"Bil. of Chn. 2005 $", "seasonal_adjustment"=>"Seasonally Adjusted Annual Rate", "seasonal_adjustment_short"=>"SAAR", "last_updated"=>"2013-01-30 07:46:54-06", "popularity"=>"93", "notes"=>"Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States.\n\nFor more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"}}}
object_instance = OpenStruct.new( hash )
end
end
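In irb I loaded the .rb file and instantiated the class. However, when I try to access an attribute (e.g. instance.seriess), I get: NoMethodError: undefined method `seriess'
Apologies again if I'm missing something obvious.
Answer 0 (score: 14)
You may be better off using standard XML-to-Hash parsing, such as the one included with Rails: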
object_hash = Hash.from_xml(xml_string)
puts object_hash['seriess']
If you are not using the Rails stack, you can use a library like Nokogiri to achieve the same behavior.
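To illustrate what "the same behavior" might look like without Rails, here is a minimal sketch that walks the XML tree and builds a nested Hash. It uses REXML (bundled with Ruby) as a stand-in for Nokogiri so that it runs without extra gems. The helper name `element_to_hash` and the shortened XML sample are assumptions made for this example, and it does not reproduce every rule that Hash.from_xml applies.

```ruby
require 'rexml/document'

# Hypothetical helper: converts an element into a Hash of its attributes and
# child elements. Leaf elements without attributes become their text content.
def element_to_hash(element)
  return element.text if element.attributes.empty? && element.elements.empty?

  hash = {}
  element.attributes.each { |name, value| hash[name] = value }
  element.elements.each do |child|
    value = element_to_hash(child)
    existing = hash[child.name]
    # Repeated child tags are collected into an Array.
    hash[child.name] = case existing
                       when nil then value
                       when Array then existing << value
                       else [existing, value]
                       end
  end
  hash
end

# Shortened version of the XML from the question.
xml = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" frequency="Quarterly" popularity="93"/> </seriess>'

root = REXML::Document.new(xml).root
result = { root.name => element_to_hash(root) }

puts result['seriess']['series']['frequency']
```

In this sketch attributes and child elements share one namespace in the Hash, so a child tag with the same name as an attribute would overwrite it. That is acceptable for the data in the question but may not be for other documents.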
Edit: If you want object-like behavior, wrapping the hash in an OpenStruct is a good approach:
object_instance = OpenStruct.new( Hash.from_xml(xml_string) )
puts object_instance.seriess
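Note: for deeply nested data you may also need to recursively convert the embedded hashes into OpenStruct instances. That is, if the attribute above is itself a hash of values, it will be a Hash rather than an OpenStruct.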
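The recursive conversion described in the note could be sketched as follows. The helper name `deep_ostruct` is made up for this example, and the hash is a shortened version of the one in the question's test case.

```ruby
require 'ostruct'

# Hypothetical helper: wraps every nested Hash in an OpenStruct so that
# chained method calls work at every level.
def deep_ostruct(value)
  case value
  when Hash
    OpenStruct.new(value.transform_values { |v| deep_ostruct(v) })
  when Array
    value.map { |v| deep_ostruct(v) }
  else
    value
  end
end

data = {
  'seriess' => {
    'realtime_start' => '2013-02-01',
    'series' => { 'id' => 'GDPC1', 'frequency' => 'Quarterly' }
  }
}

obj = deep_ostruct(data)
puts obj.seriess.series.frequency
```

Note that the returned object has to be the one you call methods on. In the question's test case the OpenStruct is assigned to a local variable inside initialize and then discarded, so the Exopenstruct instance itself never gains a seriess method.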
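Answer 1 (score: 4)
I have just started using Damien Le Berrigaud's fork of HappyMapper, and I am very happy with it. You define simple Ruby classes and include HappyMapper. When you call parse, it uses Nokogiri to slurp in the XML, and you get back a complete tree of genuine Ruby objects.
I have used it to parse multi-megabyte XML files and found it fast and reliable. Check out the README.
One tip: because XML files sometimes contain badly encoded strings, you may need to sanitize the XML like this: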
def sanitize(xml)
xml.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end
before passing it to the #parse method, to avoid Nokogiri's "Input is not proper UTF-8, indicate encoding !" error.
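One thing to be aware of: because the sanitize helper above treats the input as binary, it drops every non-ASCII byte, including valid UTF-8 characters. If you need to keep those, a possible alternative is String#scrub (Ruby 2.1+), which removes only invalid byte sequences. The sketch below compares the two; the helper name `scrub_xml` and the sample string are made up for illustration.

```ruby
# Hypothetical alternative to sanitize: keeps valid UTF-8 characters and
# removes only the byte sequences that are not valid UTF-8.
def scrub_xml(xml)
  xml.dup.force_encoding('UTF-8').scrub('')
end

# "é" encoded as valid UTF-8 (\xC3\xA9), followed by a stray invalid byte (\xFF).
raw = "caf\xC3\xA9 \xFF ok"

scrubbed = scrub_xml(raw)
stripped = raw.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')

puts scrubbed  # the accented character survives, the stray byte is gone
puts stripped  # every non-ASCII byte has been removed
```

Which one fits depends on the data: the binary approach is the blunter tool but guarantees pure ASCII output, while scrub preserves any text that was already valid.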
I went ahead and converted the OP's example to HappyMapper:
require 'happymapper' # provided by the nokogiri-happymapper gem
XML_STRING = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'
class Series; end; # fwd reference
class Seriess
include HappyMapper
tag 'seriess'
attribute :realtime_start, Date
attribute :realtime_end, Date
has_many :seriess, Series, :tag => 'series'
end
class Series
include HappyMapper
tag 'series'
attribute 'id', String
attribute 'realtime_start', Date
attribute 'realtime_end', Date
attribute 'title', String
attribute 'observation_start', Date
attribute 'observation_end', Date
attribute 'frequency', String
attribute 'frequency_short', String
attribute 'units', String
attribute 'units_short', String
attribute 'seasonal_adjustment', String
attribute 'seasonal_adjustment_short', String
attribute 'last_updated', DateTime
attribute 'popularity', Integer
attribute 'notes', String
end
def test
Seriess.parse(XML_STRING, :single => true)
end
Here is what you can do with it:
>> a = test
>> a.class
=> Seriess
>> a.seriess.first.frequency
=> "Quarterly"
>> a.seriess.first.observation_start
=> #<Date: 1947-01-01 ((2432187j,0s,0n),+0s,2299161j)>
>> a.seriess.first.popularity
=> 93
Answer 2 (score: 1)
Nokogiri solves the parsing. How you handle the data is up to you; here I use OpenStruct as an example:
require 'nokogiri'
require 'ostruct'
require 'open-uri'
doc = Nokogiri.parse(URI.open('http://www.w3schools.com/xml/note.xml')) # URI.open: Kernel#open no longer opens URLs as of Ruby 3.0
note = OpenStruct.new
note.to = doc.at('to').text
note.from = doc.at('from').text
note.heading = doc.at('heading').text
note.body = doc.at('body').text
=> #<OpenStruct to="Tove", from="Jani", heading="Reminder", body="ToveJaniReminderDon't forget me this weekend!\r\n">
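This is just a teaser; your problem may be many times larger in scale. It is only meant to give you a head start.
Edit: After digging through Google and Stack Overflow, I came across a possible mix of my answer and @Winfield's, using Rails' Hash.from_xml: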
> require 'active_support/core_ext/hash/conversions'
> xml = Nokogiri::XML.parse(URI.open('http://www.w3schools.com/xml/note.xml'))
> Hash.from_xml(xml.to_s)
=> {"note"=>{"to"=>"Tove", "from"=>"Jani", "heading"=>"Reminder", "body"=>"Don't forget me this weekend!"}}
You can then use this hash to, for example, initialize a new ActiveRecord::Base model instance, or whatever else you decide to do with it.
http://nokogiri.org/
http://ruby-doc.org/stdlib-1.9.3/libdoc/ostruct/rdoc/OpenStruct.html
https://stackoverflow.com/a/7488299/1740079
Answer 3 (score: 0)
If you want to convert XML to a Hash, I find the nori gem to be the simplest.
Example:
require 'nori'
xml = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'
hash = Nori.new.parse(xml)
hash['seriess']
hash['seriess']['series']
puts hash['seriess']['series']['@frequency']
Note the '@' in front of frequency: it is there because frequency is an attribute of 'series' rather than an element.