Nokogiri XML search

Time: 2013-06-17 17:29:44

Tags: ruby nokogiri

I have tried reading the Nokogiri documentation and so on, but I have hit a roadblock.

I am getting XML output similar to this:

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>

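But I have a large number of 'return' sections that will show up. I have been trying to use .search on the end of the string. I only want it to return the whole 'return' section for a given 'name'. Does anyone have any tips?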

Current code:

client = Savon::Client.new do
  http.auth.basic "#{opts['api_username']}", "#{opts['api_password']}"
  wsdl.document = "#{opts['api_url']}/Policy?wsdl"
end

getPolicyInformation = client.request :getPolicies
getPolicyInformation = Nokogiri::XML(getPolicyInformation.to_xml)
print getPolicyInformation

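If I search for a specified <name>, I want to get back everything in the <return> section that encloses it. Example: I only want to see the information related to <name>pstueck-passenger ondemand</name>, but I want the whole <return> section that contains it.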

2 answers:

Answer 0: (score: 1)

You can use XPath to identify the node that has a specific value and then select the ancestor element you are interested in, like this:

require 'nokogiri'

document = <<-XML
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>
XML

doc = Nokogiri::XML(document)
# Map the prefixes used in the XPath expression to their namespace URIs.
ns = { 'soap' => 'http://schemas.xmlsoap.org/soap/envelope/', 'ns1' => 'http://policy.api.control.r1soft.com/' }
# Find the <name> with the wanted text, then step back up to its enclosing <return>.
ret = doc.xpath('/soap:Envelope/soap:Body/ns1:getPoliciesResponse/return/name[text()="pstueck-passenger ondemand"]/ancestor::return', ns)

puts ret.count
puts ret.at('replicationScheduleFrequencyType').text

Edit

Updated to reflect the updated XML body. It now handles namespaces.

Answer 1: (score: 1)

Use CSS to find the node:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getPoliciesResponse xmlns:ns1="http://policy.api.control.r1soft.com/">
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>e8e13555-f577-40d2-99c8-fa8a019d3b55</diskSafeID>
        <enabled>true</enabled>
        <id>7f55f8d6-92a9-4b14-bff4-631559d92259</id>
        <lastReplicationRunTime>2013-06-16T22:00:04.918-05:00</lastReplicationRunTime>
        <name>pstueck-mysql daily</name>
        <nextReplicationRunTime>2013-06-17T22:00:00-05:00</nextReplicationRunTime>
        <replicationScheduleFrequencyType>DAILY</replicationScheduleFrequencyType>
        <state>ALERT</state>
        <warnings>Policy last completed with alerts</warnings>
      </return>
      <return>
        <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
        <description/>
        <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
        <enabled>true</enabled>
        <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
        <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
        <name>pstueck-passenger ondemand</name>
        <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
        <state>OK</state>
      </return>
    </ns1:getPoliciesResponse>
  </soap:Body>
</soap:Envelope>
EOT

return_tag = doc.at('return name[text()="pstueck-passenger ondemand"]').parent

puts return_tag.to_xml

Which outputs:

<return>
  <CDPId>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx</CDPId>
  <description/>
  <diskSafeID>bcb68765-a719-4291-912d-2e6af485ea24</diskSafeID>
  <enabled>true</enabled>
  <id>cdb65427-d6f4-4a89-9f77-8763e22dc74b</id>
  <lastReplicationRunTime>2013-06-12T13:29:40.105-05:00</lastReplicationRunTime>
  <name>pstueck-passenger ondemand</name>
  <replicationScheduleFrequencyType>ON_DEMAND</replicationScheduleFrequencyType>
  <state>OK</state>
</return>

Nokogiri supports both XPath and CSS. I find CSS easier to read.

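I used the at method to find the first matching occurrence, and to show that it really does return the first match, I swapped the order of the two <return> blocks. The at method is equivalent to search(...).first, so when you are looking for the first instance of something in a document, at is the way to go.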

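Nokogiri is usually smart enough to tell the difference between XPath and CSS selectors, so we can use the generic at and search methods. If you need to force CSS or XPath parsing because the selector is ambiguous, you can use the specific css and xpath methods (in place of search) or at_css and at_xpath (in place of at). They are all documented in the Nokogiri::XML::Node documentation.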

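The call to parent is necessary because we want the parent of the selected <name> node. I simply punted and backed up one level. This is easier to do in XPath, where we can use .. to point to the parent node.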