Get XML child element values and append them to the parent element while parsing

Asked: 2017-12-10 18:10:07

Tags: python xml

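I need to append child data to its parent in the hierarchy. In other words, the data should be flattened into the parent level at the end of the hierarchy. I don't want to hard-code the names of the parents whose children need flattening. The XML hierarchy is consistent. To summarize the requirement for the example below: each child of the 'properties' element should contain the data from its own children (for example, the 'element' tags), if there are any. If there are several such children, their values should appear in the parent as a comma-separated list.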

Sample data

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:georss="http://www.georss.org/georss" xmlns:gml="http://www.opengis.net/gml" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xml:base="https://aciww.sharepoint.com/sites/AIIMOpwa/DASHBOARD%20DATA%20TEST/_api/">
   <id>440621ab-ccb9-4dd8-9a26-75fcb74dbf07</id>
   <title />
   <updated>2017-12-08T14:22:48Z</updated>
   <entry m:etag="&quot;1&quot;">
      <id>b3e30263-cead-40ef-8d2c-c790b433bef7</id>
      <title />
      <updated>2017-12-08T14:22:48Z</updated>
      <author>
         <name />
      </author>
      <content type="application/xml">
         <m:properties>
            <d:Workstream_x0020_Lead_x0028_s_x0Id m:type="Collection(Edm.Int32)">
               <d:element>18</d:element>
            </d:Workstream_x0020_Lead_x0028_s_x0Id>
            <d:Workstream_x0020_Lead_x0028_s_x0StringId m:type="Collection(Edm.String)">
               <d:element>18</d:element>
            </d:Workstream_x0020_Lead_x0028_s_x0StringId>
            <d:Functional_x0020_Program_x0020_LId m:type="Collection(Edm.Int32)">
               <d:element>18</d:element>
            </d:Functional_x0020_Program_x0020_LId>
            <d:Functional_x0020_Program_x0020_LStringId m:type="Collection(Edm.String)">
               <d:element>18</d:element>
               <d:element>333</d:element>
            </d:Functional_x0020_Program_x0020_LStringId>
         </m:properties>
      </content>
   </entry>
</feed>

Below is my code. It returns all elements serially, that is, each parent followed by its children.

import xml.etree.ElementTree as etree

tree = etree.parse('CR2.xml')  # Change this to the location of your XML file


def removeNS(tag):
    # Strip the '{namespace}' prefix that ElementTree puts in front of tag names.
    if tag.find('}') == -1:
        return tag
    else:
        return tag.split('}', 1)[1]


# iter() visits the element itself and all of its descendants in document order.
for child in tree.iter():
    if str(child).find('entry') > 0:
        for child1 in child.iter():
            if str(child1).find('content') > 0:
                for child2 in child1.iter():
                    if str(child2).find('properties') > 0:
                        for child3 in child2.iter():
                            print({removeNS(child3.tag): child3.text})

Expected result

{'properties': '\n            '}
{'Workstream_x0020_Lead_x0028_s_x0Id': '18'}
{'Workstream_x0020_Lead_x0028_s_x0StringId': '18'}
{'Functional_x0020_Program_x0020_LId': '18'}
{'Functional_x0020_Program_x0020_LStringId': '18,333'}

The actual result I get is shown below

{'properties': '\n            '}
{'Workstream_x0020_Lead_x0028_s_x0Id': None}
{'element': '18'}
{'Workstream_x0020_Lead_x0028_s_x0StringId': '\n               '}
{'element': '18'}
{'Functional_x0020_Program_x0020_LId': '\n               '}
{'element': '18'}
{'Functional_x0020_Program_x0020_LStringId': '\n               '}
{'element': '18'}
{'element': '333'}
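
The difference between the two outputs comes from iter(), which walks every descendant, so each 'element' is printed as its own entry. One possible way to get the flattened form is to loop only over the direct children of 'properties' and join the text of whatever children each of them has. The following is a minimal sketch under some assumptions, not a definitive solution: the names local_name, flatten_properties and SAMPLE are made up for illustration, SAMPLE is a trimmed inline copy of the sample data above, and the sketch targets Python 3. It does not hard-code any property names, and it skips the 'properties' entry itself.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the m: prefix, copied from the sample feed.
M_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

# Trimmed inline copy of the sample data (illustration only).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry>
    <content type="application/xml">
      <m:properties>
        <d:Workstream_x0020_Lead_x0028_s_x0Id m:type="Collection(Edm.Int32)">
          <d:element>18</d:element>
        </d:Workstream_x0020_Lead_x0028_s_x0Id>
        <d:Workstream_x0020_Lead_x0028_s_x0StringId m:type="Collection(Edm.String)">
          <d:element>18</d:element>
        </d:Workstream_x0020_Lead_x0028_s_x0StringId>
        <d:Functional_x0020_Program_x0020_LId m:type="Collection(Edm.Int32)">
          <d:element>18</d:element>
        </d:Functional_x0020_Program_x0020_LId>
        <d:Functional_x0020_Program_x0020_LStringId m:type="Collection(Edm.String)">
          <d:element>18</d:element>
          <d:element>333</d:element>
        </d:Functional_x0020_Program_x0020_LStringId>
      </m:properties>
    </content>
  </entry>
</feed>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def flatten_properties(root):
    """Yield (name, value) for each direct child of every m:properties element.

    If a property has children of its own (e.g. d:element), their text is
    joined with commas; otherwise the property's own text is used.
    """
    for props in root.iter("{" + M_NS + "}properties"):
        for prop in props:  # direct children only, unlike iter()
            items = [child.text or "" for child in prop]
            value = ",".join(items) if items else prop.text
            yield local_name(prop.tag), value


root = ET.fromstring(SAMPLE)
for name, value in flatten_properties(root):
    print({name: value})
```

To run this against the real file, ET.parse('CR2.xml').getroot() could be passed to flatten_properties in place of the inline string.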

0 answers:

No answers yet