Python: Parse XML into a nested dict with attribute values as keys and tag text as values, preserving the hierarchy

Time: 2017-11-29 13:07:08

Tags: python json xml parsing lxml

The XML below is the response from a web service endpoint.

     <ns2:getModuleAnswersResponse xmlns:ns2="http://www.example.com/ManagerService">
        <ns2:answer>
           <ns2:answer key="storage">
              <ns2:value key="failover">true</ns2:value>
              <ns2:answer key="timeseries">
                 <ns2:answer key="socketconnector">
                    <ns2:value key="host">localhost</ns2:value>
                    <ns2:value key="port">2020</ns2:value>
                 </ns2:answer>
              </ns2:answer>
           </ns2:answer>
           <ns2:answer key="frontendws">
              <ns2:answer key="tomcat">
                 <ns2:value key="host">localhost</ns2:value>
                 <ns2:value key="protocol">http</ns2:value>
                 <ns2:value key="username">user</ns2:value>
                 <ns2:value key="password">abc</ns2:value>
              </ns2:answer>
              <ns2:value key="instance">WS</ns2:value>
           </ns2:answer>
           <ns2:answer key="topologyservice">
              <ns2:value key="host">localhost</ns2:value>
              <ns2:answer key="gateway2">
                 <ns2:value key="host">localhost</ns2:value>
                 <ns2:value key="port">48443</ns2:value>
                 <ns2:value key="authentication">certificate</ns2:value>
              </ns2:answer>
           </ns2:answer>
           <ns2:answers key="connection">
              <ns2:answer>
                 <ns2:answer key="primary">
                    <ns2:answer key="vcenter">
                       <ns2:value key="host">localhost</ns2:value>
                       <ns2:value key="username">admin</ns2:value>
                       <ns2:value key="password">abc</ns2:value>
                    </ns2:answer>
                 </ns2:answer>
              </ns2:answer>
           </ns2:answers>
           <ns2:value key="use_advancedsettings">false</ns2:value>
        </ns2:answer>
     </ns2:getModuleAnswersResponse>

I need to parse this XML in Python to produce a response in this format:

{'storage':
    {'failover': 'true', 'timeseries': 
        {'socketconnector': 
            {'host': 'localhost', 
             'port': '2020'
            }
        }
    }, 
'frontendws': 
    {'tomcat': 
        { 'host': 'localhost', 
          'protocol': 'http', 
          'username': 'user', 
          'password': 'abc'
        }, 'instance': 'WS'
    }, 
'topologyservice': 
    {'host': 'localhost', 
     'gateway2': 
        {'host': 'localhost', 
         'port': '48443', 
         'authentication': 'certificate'
        }
    }, 
'connection': 
    {'primary': 
        {'vcenter': 
            {'host': 'localhost', 
             'username': 'admin', 
             'password': 'abc'
             }
        }
    },
'use_advancedsettings': 'false'
}

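This is a legacy way of representing XML. I tried several recursive approaches in Python, iterating with lxml, but could not get the correct result. I am looking for a Python solution.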

1 Answer:

Answer 0: (score: 0)

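I was able to find a recursive solution to this problem. It keys each element by its `key` attribute, takes the element text as the value for leaves, and performs a depth-first search (DFS) over the tags.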

import xml.etree.ElementTree as ET


def func(element):
    """Recursively convert an element into a nested dict keyed by its 'key' attribute."""
    key = element.attrib.get('key')
    # Element.getchildren() was removed in Python 3.9; iterate the element instead.
    children = list(element)

    # Base condition: a leaf maps its key to its text, or to {} if it has no text.
    if not children:
        if key is None:
            return {}
        return {key: element.text if element.text else {}}

    # Depth-first: merge the dicts produced by all children.
    my_json = {}
    for child in children:
        my_json.update(func(child))

    # Wrapper elements without a 'key' attribute are flattened into their parent,
    # so no artificial numeric keys appear in the result.
    if key is None:
        return my_json
    return {key: my_json}


root = ET.parse('test.xml')
print(func(root.getroot()))
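
Since the XML comes back from a web service rather than from a file, the same recursive idea can be tried directly on the response text. The following is a minimal sketch, not a definitive implementation: `answers_to_dict` and `SAMPLE` are hypothetical names, the sample is a shortened copy of the response above, and stripping whitespace from leaf text is an assumption that goes beyond the original answer.

```python
import json
import xml.etree.ElementTree as ET

# Shortened sample of the response; in practice this would be the response body.
SAMPLE = """\
<ns2:getModuleAnswersResponse xmlns:ns2="http://www.example.com/ManagerService">
  <ns2:answer>
    <ns2:answer key="storage">
      <ns2:value key="failover">true</ns2:value>
      <ns2:answer key="timeseries">
        <ns2:answer key="socketconnector">
          <ns2:value key="host">localhost</ns2:value>
          <ns2:value key="port">2020</ns2:value>
        </ns2:answer>
      </ns2:answer>
    </ns2:answer>
    <ns2:answers key="connection">
      <ns2:answer>
        <ns2:answer key="primary">
          <ns2:value key="host">localhost</ns2:value>
        </ns2:answer>
      </ns2:answer>
    </ns2:answers>
    <ns2:value key="use_advancedsettings">false</ns2:value>
  </ns2:answer>
</ns2:getModuleAnswersResponse>
"""


def answers_to_dict(element):
    """Hypothetical helper: nested dict keyed by each element's 'key' attribute."""
    key = element.get('key')
    children = list(element)
    if children:
        # Depth-first: merge the dicts produced by all children.
        value = {}
        for child in children:
            value.update(answers_to_dict(child))
    else:
        # Leaf: strip surrounding whitespace; empty text becomes an empty dict.
        value = (element.text or '').strip() or {}
    if key is None:
        # Keyless wrappers contribute their children directly to the parent.
        return value if isinstance(value, dict) else {}
    return {key: value}


result = answers_to_dict(ET.fromstring(SAMPLE))
print(json.dumps(result, indent=2))
```

All values stay strings (for example `'true'` and `'2020'`), matching the requested format. If two siblings share the same `key`, the later one overwrites the earlier one, so repeated keys at one level would need a list-based variant.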