Appending to an XML structure in Python

Asked: 2014-11-08 11:44:53

Tags: python xml

I want to change/add custom child elements in the XML generated by my script.

The top element is AAA:

top = Element('AAA')

collected_lines looks like this:

[['TY', ' RPRT'], ['A1', ' Peter'], ['T3', ' Something'], ['ER', ' ']]

Then I enumerate all the lines one by one and create a child element of top for each:

for line in enumerate(collected_lines):
 child = SubElement(top, line[0])
 child.text = line[1]

Output:

<?xml version="1.0" ?>
<AAA>
  <TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
</AAA>

I want to add <ART> elements to the top element and then print the XML like this:

<?xml version="1.0" ?>
<AAA>
  <ART>
   <TY> RPRT</TY>
   <A1> Peter</A1>
   <T3> Something</T3>
   <ER> </ER>
  </ART>
  <ART>
   <TY> RPRT2</TY>
   <A1> Peter</A1>
   <T3> Something2</T3>
   <ER> </ER>
  </ART>
  <ART>
   <TY> RPRT2</TY>
   <A1> Peter</A1>
   <T3> Something2</T3>
   <ER> </ER>
  </ART>
</AAA>

I tried to do this with if statements, like:

if "TY" in line:
 "append somehow before TY element, <ART>"
if "ER" in line:
 "append somehow after ER element, </ART>"

Is there a simple way to solve this?

1 Answer:

Answer 0: (score: 1)

Just reassign the top element and use insert:

import xml.etree.ElementTree as ET

top = ET.Element('AAA')
# by the way you need index, element on enumerate
for i, line in enumerate(collected_lines):
    child = ET.SubElement(top, line[0])
    child.text = line[1]

art = top
art.tag = 'ART'
top = ET.Element('AAA')
top.insert(1, art)

ET.tostring(top)
'<AAA><ART><TY> RPRT</TY><A1> Peter</A1><T3> Something</T3><ER> </ER></ART></AAA>'
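
The comment about enumerate refers to the fact that it yields (index, item) pairs, so without unpacking, line[0] is the integer index rather than the tag name. A small illustration, using a shortened copy of the sample data from the question:

```python
collected_lines = [['TY', ' RPRT'], ['A1', ' Peter']]

# Without unpacking, each item is an (index, pair) tuple.
pairs = list(enumerate(collected_lines))
print(pairs[0])   # (0, ['TY', ' RPRT'])

# With unpacking, the tag and text are reachable again.
for i, line in enumerate(collected_lines):
    print(i, line[0], repr(line[1]))
```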

As @twasbrillig pointed out, you don't even need enumerate; a plain for loop is enough:

...
for line in collected_lines:
    child = ET.SubElement(top, line[0])
    child.text = line[1]
...
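
If the records are still available as a flat list of [tag, text] pairs, the grouping can also be done while the tree is being built, which is close to the if-based idea in the question. This is only a minimal sketch, assuming every record starts with a TY line; the two-record sample data and the helper name build_tree are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: two records, each running from TY to ER.
collected_lines = [
    ['TY', ' RPRT'], ['A1', ' Peter'], ['T3', ' Something'], ['ER', ' '],
    ['TY', ' RPRT2'], ['A1', ' Peter'], ['T3', ' Something2'], ['ER', ' '],
]

def build_tree(lines):
    """Wrap each run of lines starting at TY in its own <ART> under <AAA>."""
    top = ET.Element('AAA')
    art = None
    for tag, text in lines:
        # A TY line opens a new record (assumption: records never nest).
        # The "art is None" check covers input that does not start with TY.
        if tag == 'TY' or art is None:
            art = ET.SubElement(top, 'ART')
        child = ET.SubElement(art, tag)
        child.text = text
    return top

top = build_tree(collected_lines)
print(ET.tostring(top, encoding='unicode'))
```

Because each ART is created as soon as its TY line is seen, no second pass over the finished tree is needed.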

Another update

The OP's edit also asks how to handle multiple sections, as in the example above. This can be done with ordinary Python logic:

import xml.etree.ElementTree as ET

s = '''<?xml version="1.0" ?>
<AAA>
  <TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something3</T3>
  <ER> </ER>
</AAA>'''

top = ET.fromstring(s)
# assign a new Element to replace top later on
new_top = ET.Element('AAA')
# get all indexes where TY, ER are at
ty = [i for i,n in enumerate(top) if n.tag == 'TY']
er = [i for i,n in enumerate(top) if n.tag == 'ER']
# top[x:y+1] will get all the sibling elements from TY through ER inclusive (from their indexes)
nodes = [top[x:y+1] for x,y in zip(ty,er)]

# then loop through each group of nodes and add a SubElement ART
# and loop through each node in the group and append it to ART
for node in nodes:
    art = ET.SubElement(new_top, 'ART')
    for each in node:
        # append keeps the original TY, A1, T3, ER order
        art.append(each)
# replace top Element by new_top
top = new_top

# you don't need lxml, I just used it to pretty_print the xml
from lxml import etree
# you can just ET.tostring(top)
print(etree.tostring(etree.fromstring(ET.tostring(top)),
          xml_declaration=True, encoding='utf-8', pretty_print=True).decode('utf-8'))
<?xml version='1.0' encoding='utf-8'?>
<AAA>
  <ART><TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  </ART>
  <ART><TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
  </ART>
  <ART><TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something3</T3>
  <ER> </ER>
</ART>
</AAA>
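
For pretty-printing without lxml, the standard library's xml.dom.minidom is one option. This is a minimal sketch, assuming the tree was built from scratch so that it carries no whitespace-only text between elements (a tree parsed from an already indented string would keep that whitespace and print with extra blank lines). The helper name pretty is hypothetical:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

def pretty(element):
    """Return an indented XML string for an ElementTree element."""
    raw = ET.tostring(element, encoding='unicode')
    return minidom.parseString(raw).toprettyxml(indent='  ')

# Build a small tree by hand (sample data shortened from the question).
top = ET.Element('AAA')
art = ET.SubElement(top, 'ART')
for tag, text in [['TY', ' RPRT'], ['A1', ' Peter']]:
    ET.SubElement(art, tag).text = text

out = pretty(top)
print(out)
```

On Python 3.9 and later, ET.indent(top) followed by ET.tostring(top) is another standard-library route.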