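This document is the output of a firewall configuration. I am trying to build a hash of the firewall rules. Later I will output this data to CSV, the console, or whatever else I need: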
<table index="44" title=" from PUBLIC to DMZ administrative service rules on Firewall01" ref="FILTER.BLACKLIST">
<headings>
<heading>Rule</heading>
<heading>Action</heading>
<heading>Source</heading>
<heading>Destination</heading>
<heading>Service</heading>
<heading>Log</heading>
</headings>
<tablebody>
<tablerow>
<tablecell><item>test_inbound</item></tablecell>
<tablecell><item>Allow</item></tablecell>
<tablecell><item gotoref="CONFIG.3.452">[Group] test_b2_group</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>Yes</item></tablecell>
</tablerow>
<tablerow>
<tablecell><item>host02_inbound</item></tablecell>
<tablecell><item>Allow</item></tablecell>
<tablecell><item gotoref="CONFIG.3.447">[Group] host02_group</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>Yes</item></tablecell>
</tablerow>
<tablerow>
<tablecell><item>randomhost</item></tablecell>
<tablecell><item>Allow</item></tablecell>
**<tablecell><item gotoref="CONFIG.3.383">[Group] Host_group_2</item><item gotoref="CONFIG.3.382">[Group] another_server</item></tablecell>**
<tablecell><item gotoref="CONFIG.3.510">[Group] crazy_application</item><item gotoref="CONFIG.3.511">[Group] internal_app</item><item gotoref="CONFIG.3.525">[Group] online_application</item></tablecell>
<tablecell><item gotoref="CONFIG.3.783">[Group] junos-https</item></tablecell>
<tablecell><item>No</item></tablecell>
</tablerow>
</tablebody>
</table>
We have the column headings and three firewall rules.
Here is my code:
#!/usr/bin/env ruby
require 'nokogiri'
require 'csv'
fwpol = File.open(ARGV[0]) { |f| Nokogiri::XML(f) }
rule_array = []
fwpol.xpath('./table/tablebody/tablerow').each do |item|
  rules = {}
  rules[:name] = item.xpath('./tablecell/item')[0].text
  rules[:action] = item.xpath('./tablecell/item')[1].text
  rules[:source] = item.xpath('./tablecell/item')[2].text
  rule_array << rules
end
puts rule_array
The first two hash entries, :name and :action, work fine because those fields contain only a single value.
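When I run the code, the output is missing values wherever a cell contains more than one item. The XML line in bold shows what I mean. I need to iterate over these values somehow, but none of my attempts so far have worked.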
Answer 0 (score: 2)
You can get the text of multiple elements as an array in the following way.
require 'nokogiri'
require 'csv'
fwpol = File.open(ARGV[0]) { |f| Nokogiri::XML(f) }
rule_array = []
fwpol.xpath('./table/tablebody/tablerow').each do |item|
  rules = {}
  rules[:name] = item.xpath('./tablecell[1]/item').text
  rules[:action] = item.xpath('./tablecell[2]/item').text
  rules[:source] = item.xpath('./tablecell[3]/item').map(&:text)
  rule_array << rules
end
puts rule_array
Here is the output.
{:name=>"test_inbound", :action=>"Allow", :source=>["[Group] test_b2_group"]}
{:name=>"host02_inbound", :action=>"Allow", :source=>["[Group] host02_group"]}
{:name=>"randomhost", :action=>"Allow", :source=>["[Group] Host_group_2", "[Group] another_server"]}
Answer 1 (score: 1)
I would do something like this:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<table index="44" title=" from PUBLIC to DMZ administrative service rules on Firewall01" ref="FILTER.BLACKLIST">
<tablebody>
<tablerow>
<tablecell><item>test_inbound</item></tablecell>
<tablecell><item>Allow</item></tablecell>
<tablecell><item gotoref="CONFIG.3.452">[Group] test_b2_group</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>[Host] Any</item></tablecell>
<tablecell><item>Yes</item></tablecell>
</tablerow>
<tablerow>
<tablecell><item>randomhost</item></tablecell>
<tablecell><item>Allow</item></tablecell>
<tablecell><item gotoref="CONFIG.3.383">[Group] Host_group_2</item><item gotoref="CONFIG.3.382">[Group] another_server</item></tablecell>
<tablecell><item gotoref="CONFIG.3.510">[Group] crazy_application</item><item gotoref="CONFIG.3.511">[Group] internal_app</item><item gotoref="CONFIG.3.525">[Group] online_application</item></tablecell>
<tablecell><item gotoref="CONFIG.3.783">[Group] junos-https</item></tablecell>
<tablecell><item>No</item></tablecell>
</tablerow>
</tablebody>
</table>
EOT
rule_array = doc.search('tablerow').map{ |row|
  name, action, source = row.search('tablecell')[0, 3].map{ |tc| tc.search('item').map(&:text) }
  {
    name: name,
    action: action,
    source: source
  }
}
When run, this returns rule_array, an array of hashes, the last of which contains two item entries in its :source:
require 'ap'
ap rule_array
# >> [
# >> [0] {
# >> :name => [
# >> [0] "test_inbound"
# >> ],
# >> :action => [
# >> [0] "Allow"
# >> ],
# >> :source => [
# >> [0] "[Group] test_b2_group"
# >> ]
# >> },
# >> [1] {
# >> :name => [
# >> [0] "randomhost"
# >> ],
# >> :action => [
# >> [0] "Allow"
# >> ],
# >> :source => [
# >> [0] "[Group] Host_group_2",
# >> [1] "[Group] another_server"
# >> ]
# >> }
# >> ]
Notes: Don't do this:
fwpol = File.open(ARGV[0]) { |f| Nokogiri::XML(f) }
It is simpler to use:
fwpol = Nokogiri::XML(File.read(ARGV[0]))
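And instead of:
item.xpath('./tablecell/item')[0].text
item.xpath('./tablecell/item')[1].text
item.xpath('./tablecell/item')[2].text
just find the tablecell tags once, slice off the ones you need with [0, 3], and then iterate over that small group. It is faster and reduces code duplication.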
Also see "How to avoid joining all text from Nodes when scraping".