Filtering XmlSimple output in Ruby

Time: 2015-02-13 09:29:57

Tags: ruby xml xml-simple

I am new to Ruby and am trying to parse an XML structure and filter on some of its attributes. The XML looks like this:

<systeminfo>
<machines>
<machine name="localhost">
<repository worker="localhost:8060" status="OK"/>
<dataengine worker="localhost:27042" status="OK"/>
<serverwebapplication worker="localhost:8000" status="OK"/>
<serverwebapplication worker="localhost:8001" status="OK"/>
<vizqlserver worker="localhost:9100" status="OK"/>
<vizqlserver worker="localhost:9101" status="OK"/>
<dataserver worker="localhost:9700" status="OK"/>
<dataserver worker="localhost:9701" status="OK"/>
<backgrounder worker="localhost:8250" status="OK"/>
<webserver worker="localhost:80" status="OK"/>
</machine>
</machines>
<service status="OK"/>
</systeminfo>

I want to check whether the status attributes are OK. This is the code I have written so far:

#!/usr/bin/ruby -w

require 'rubygems'
require 'net/http'
require 'xmlsimple'

url = URI.parse("URL to XML")
req = Net::HTTP::Get.new(url.path)
res = Net::HTTP.start(url.host, url.port) {|http|
http.request(req)
}

sysinfodoc = XmlSimple.xml_in(res.body)


sysinfodoc["machines"][0]["machine"][0].each do |status|
p status[1][0]
p status[1][1]
end

Output:

{"worker"=>"localhost:8060", "status"=>"OK"}
nil
{"worker"=>"localhost:27042", "status"=>"OK"}
nil
{"worker"=>"localhost:9100", "status"=>"OK"}
{"worker"=>"localhost:9101", "status"=>"OK"}
{"worker"=>"localhost:8000", "status"=>"OK"}
{"worker"=>"localhost:8001", "status"=>"OK"}
{"worker"=>"localhost:8250", "status"=>"OK"}
nil
{"worker"=>"localhost:9700", "status"=>"OK"}
{"worker"=>"localhost:9701", "status"=>"OK"}
{"worker"=>"localhost:80", "status"=>"OK"}
nil
108
111

Update: the output should look like this:

OK
OK
OK
OK
OK
OK
OK
OK
OK
OK

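This script is meant to be used with Nagios. So later on, instead of printing the results, I want to check whether any of the status attributes contains something other than "OK".

How do I get rid of the nils and Fixnums? And why are there Fixnums in the first place?

How can I filter this so that I get the status of every child of machine? Or is this the wrong approach altogether?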

2 answers:

Answer 0 (score: 2)

How about using Nokogiri and XPath?

require 'nokogiri'
@doc = Nokogiri::XML(File.open("example.xml"))
@doc.xpath("//machine/*/@status").each { |x| puts x }

The result will be

OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
=> 0

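Answer 1 (score: 1)

Disclaimer: using Nokogiri and XPath, as Mathias suggested, is more elegant and easier.

Whenever you get unexpected output, try printing the status local variable itself: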

sysinfodoc["machines"][0]["machine"][0].each do |status|
  # p status[1][0]
  p status
end

You will see that the output is:

#⇒ ["name", "localhost"]
#⇒ ["repository", [{"worker"=>"localhost:8060", "status"=>"OK"}]]
#⇒ ["dataengine", [{"worker"=>"localhost:27042", "status"=>"OK"}]]
#⇒ ...
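
This also accounts for the nils and the Fixnums in the question. The sketch below replays the two kinds of pairs on hard-coded data copied from the output above (the variable names are made up for illustration). It assumes the original script ran on Ruby 1.8, where String#[] with an Integer returned the character code; on Ruby 1.9 and later the same call returns a one-character String, so getbyte is used here to get the same numbers.

```ruby
# Two of the [key, value] pairs that Hash#each yields, copied from the output above
pairs = [
  ["name", "localhost"],
  ["repository", [{ "worker" => "localhost:8060", "status" => "OK" }]]
]

name_value = pairs[0][1] # a String
repo_value = pairs[1][1] # an Array holding a single Hash

# An Array with one element has nothing at index 1, hence the nil
second_repo = repo_value[1]

# Byte values of the first two characters of "localhost";
# Ruby 1.8's String#[] with an Integer returned these numbers directly
first_byte  = name_value.getbyte(0)
second_byte = name_value.getbyte(1)

p second_repo # nil
p first_byte  # 108 ("l")
p second_byte # 111 ("o")
```

So the nils come from element types that occur only once, and 108 and 111 are the first two characters of the machine name.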

That said, to get what you want you should do:

sysinfodoc["machines"][0]["machine"][0].values.each do |status|
  next unless Array === status
  p status.last['status']
end
# "OK"
# "OK"
# "OK"
# ...

The check that status is an Array is needed because of the name entry, whose value is a plain String:
# ["name", "localhost"]
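
For the Nagios use case mentioned in the question, the same filtering can feed an exit status instead of printing each value. This is only a minimal sketch under a few assumptions: the machine hash is hard-coded in the shape XmlSimple produced above (with one status changed to "DOWN" for illustration), the helper names failing_workers and nagios_result are hypothetical, and it relies on the usual Nagios plugin convention of exit code 0 for OK and 2 for CRITICAL. Unlike status.last, it looks at every worker in each array.

```ruby
# Hypothetical sample in the shape XmlSimple returned for one <machine> element
machine = {
  "name" => "localhost",
  "repository" => [{ "worker" => "localhost:8060", "status" => "OK" }],
  "vizqlserver" => [
    { "worker" => "localhost:9100", "status" => "OK" },
    { "worker" => "localhost:9101", "status" => "DOWN" } # altered for the example
  ]
}

# Collect every worker hash whose status is not "OK"
def failing_workers(machine)
  machine.values
         .grep(Array) # skips the "name" => "localhost" String entry
         .flatten     # one flat list of worker hashes
         .reject { |worker| worker["status"] == "OK" }
end

# Build a Nagios-style message and exit code (0 = OK, 2 = CRITICAL)
def nagios_result(machine)
  failed = failing_workers(machine)
  if failed.empty?
    ["OK - all workers report OK", 0]
  else
    names = failed.map { |worker| worker["worker"] }.join(", ")
    ["CRITICAL - not OK: #{names}", 2]
  end
end

message, code = nagios_result(machine)
puts message
# A real plugin would finish with: exit code
```

In the actual script, machine would be sysinfodoc["machines"][0]["machine"][0] rather than a literal.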

Hope it helps.