Parsing an XML subtree

Date: 2014-04-02 12:33:15

Tags: python xml

I have to parse an XML file using lxml or the xml.etree.ElementTree module:

<?xml version="1.0"?>
<corners>
  <version>1.05</version>
  <process>
    <name>ss649</name>
    <statistics>
      <statistic name="Min" forparameter="modname" isextreme="no" style="tbld">
        <value>0.00073</value>
        <real_value>7.300e-10</real_value>
      </statistic>
      <statistic name="Max" forparameter="modname" isextreme="no" style="tbld">
        <value>0.32420</value>
        <real_value>3.242e-07</real_value>
     </statistic>
     <variant>
          <name>Unit</name>
          <value>
            <value>Size</value>
            <statistics>
              <statistic name="Min" forparameter="modname1" isextreme="no" style="tbld">
                <value>0.02090</value>
                <real_value>2.090e-08</real_value>
              </statistic>
              <statistic name="Max" forparameter="modname2" isextreme="no" style="tbld">
                <value>0.02090</value>
                <real_value>2.090e-08</real_value>
              </statistic>
         </variant>

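I have to extract all the values and build a dict from them, but I can't access the subtree. How can I do this?

I'm trying to create a dictionary that looks like this: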

 dict = {
      'modname': {
        'Min': 0.00073,
        'Max': 0.32420,
      }
 }

3 answers:

Answer 0 (score: 2)

xmltodict is definitely something you should consider using:

from pprint import pprint
import xmltodict

data = """<?xml version="1.0"?>
<corners>
  <version>1.05</version>
  <process>
    <name>ss649</name>
    <statistics>
      <statistic name="Min" forparameter="modname" isextreme="no" style="tbld">
        <value>0.00073</value>
        <real_value>7.300e-10</real_value>
      </statistic>
      <statistic name="Max" forparameter="modname" isextreme="no" style="tbld">
        <value>0.32420</value>
        <real_value>3.242e-07</real_value>
     </statistic>
    </statistics>
  </process>
</corners>"""

pprint(xmltodict.parse(data))

One line of code does the parsing.

Hope that helps.
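To get from the parsed output to the dictionary the question asks for, something like the following sketch might work. It relies on xmltodict's default conventions (attributes are keyed with an `@` prefix, and repeated elements become a list). The function name `stats_from_xmltodict` is hypothetical, and the `parsed` literal is a hand-written stand-in for `xmltodict.parse(data)` so the snippet runs without the library installed.

```python
def stats_from_xmltodict(parsed):
    """Build {forparameter: {name: value}} from xmltodict-style output."""
    statistics = parsed['corners']['process']['statistics']['statistic']
    # A single <statistic> comes back as one mapping rather than a list.
    if not isinstance(statistics, list):
        statistics = [statistics]
    result = {}
    for stat in statistics:
        # Attributes carry an '@' prefix; child elements are plain keys.
        result.setdefault(stat['@forparameter'], {})[stat['@name']] = float(stat['value'])
    return result


# Assumption: hand-written stand-in for xmltodict.parse(data).
# With the library installed, pass xmltodict.parse(data) instead.
parsed = {
    'corners': {
        'version': '1.05',
        'process': {
            'name': 'ss649',
            'statistics': {
                'statistic': [
                    {'@name': 'Min', '@forparameter': 'modname',
                     '@isextreme': 'no', '@style': 'tbld',
                     'value': '0.00073', 'real_value': '7.300e-10'},
                    {'@name': 'Max', '@forparameter': 'modname',
                     '@isextreme': 'no', '@style': 'tbld',
                     'value': '0.32420', 'real_value': '3.242e-07'},
                ],
            },
        },
    },
}

print(stats_from_xmltodict(parsed))
# {'modname': {'Min': 0.00073, 'Max': 0.3242}}
```

The values are converted with float() because the parser returns element text as strings.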

Answer 1 (score: 2)

I used the xml.etree.ElementTree module:

import xml.etree.ElementTree as ET

result = {}
tree = ET.parse('file.xml')
root = tree.getroot()
for process in root:  # direct children of <corners>, e.g. <process>
    # iter() walks the whole subtree, so nested <statistic> elements
    # (for example inside <variant>) are visited as well
    for stat in process.iter('statistic'):
        name = stat.get('name')
        parameter = stat.get('forparameter')
        # findtext() reads the direct <value> child and ignores <real_value>
        value = float(stat.findtext('value'))
        # setdefault() replaces the Python 2-only dict.has_key() check
        result.setdefault(parameter, {})[name] = value

Output:

{
      'modname': {
        'Min': 0.00073,
        'Max': 0.32420,
      }
}
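If the top-level statistics need to be kept apart from the ones nested under <variant>, one option is to select each group by its path. The sketch below is only an illustration. `SAMPLE_XML` is a trimmed copy of the question's file with the missing closing tags filled in by assumption (the original is truncated), and the helper name `stats_to_dict` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: well-formed, trimmed version of the truncated file in the question.
SAMPLE_XML = """<?xml version="1.0"?>
<corners>
  <process>
    <name>ss649</name>
    <statistics>
      <statistic name="Min" forparameter="modname">
        <value>0.00073</value>
        <real_value>7.300e-10</real_value>
      </statistic>
      <statistic name="Max" forparameter="modname">
        <value>0.32420</value>
        <real_value>3.242e-07</real_value>
      </statistic>
      <variant>
        <name>Unit</name>
        <value>
          <value>Size</value>
          <statistics>
            <statistic name="Min" forparameter="modname1">
              <value>0.02090</value>
            </statistic>
            <statistic name="Max" forparameter="modname2">
              <value>0.02090</value>
            </statistic>
          </statistics>
        </value>
      </variant>
    </statistics>
  </process>
</corners>"""


def stats_to_dict(statistic_elements):
    """Map forparameter -> {name: value} for the given <statistic> elements."""
    result = {}
    for stat in statistic_elements:
        value = float(stat.findtext('value'))  # direct <value> child only
        result.setdefault(stat.get('forparameter'), {})[stat.get('name')] = value
    return result


root = ET.fromstring(SAMPLE_XML)

# Only the <statistic> elements directly under process/statistics.
top_level = stats_to_dict(root.findall('./process/statistics/statistic'))

# One entry per <variant>, keyed by its <name>, covering its whole subtree.
per_variant = {
    variant.findtext('name'): stats_to_dict(variant.iter('statistic'))
    for variant in root.iter('variant')
}

print(top_level)    # {'modname': {'Min': 0.00073, 'Max': 0.3242}}
print(per_variant)  # {'Unit': {'modname1': {'Min': 0.0209}, 'modname2': {'Max': 0.0209}}}
```

findall() with an explicit path matches only that exact level, while iter() descends through the entire subtree. Choosing between the two is what separates the outer statistics from the nested ones.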

Answer 2 (score: 0)

You may want to take a look at this rather nice ActiveState recipe:

http://code.activestate.com/recipes/410469-xml-as-dictionary/

I came across it via the following SO post, which may also be useful:

How to convert an xml string to a dictionary in Python?

xmltodict is also a good option:

https://github.com/martinblech/xmltodict
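The general idea of turning an XML tree into nested dictionaries can also be sketched with the standard library alone. The code below is not taken from the ActiveState recipe or from xmltodict. It is a minimal illustration with a hypothetical function name, `element_to_dict`, and it borrows the `@` prefix for attributes and the `#text` key purely as a naming convention.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an Element into plain dicts, lists and strings."""
    children = list(elem)
    if not children and not elem.attrib:
        # Leaf element without attributes: just return its text.
        return (elem.text or '').strip()
    # Attributes get an '@' prefix so they cannot clash with child tags.
    node = {'@' + key: value for key, value in elem.attrib.items()}
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or '').strip()
    if text:
        node['#text'] = text
    return node


root = ET.fromstring(
    '<statistics>'
    '<statistic name="Min"><value>0.00073</value></statistic>'
    '<statistic name="Max"><value>0.32420</value></statistic>'
    '</statistics>'
)
print(element_to_dict(root))
# {'statistic': [{'@name': 'Min', 'value': '0.00073'}, {'@name': 'Max', 'value': '0.32420'}]}
```

All leaf values stay strings here. Converting them to numbers is left to the caller, since a generic converter cannot know which fields are numeric.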