Question

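I am new to Python. I want to create an XML tree with one parent, several children, and several subchildren. The child tags are stored in the list 'TAG' and the subchild tags in the list 'SUB'. I came up with the following code, but I cannot get the desired result!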

def make_xml(tag,sub):
    '''
    Takes in two lists and Returns a XML object.
    The first list has to contain all the tag objects
    The Second list has to contain child data's
    '''
    from xml.etree.ElementTree import Element, SubElement, Comment, tostring
    top = Element("Grand Parent")
    comment = Comment('This is the ccode parse tree')
    top.append(comment)
    i=0
    try:
        for ee in tag:
            child = SubElement(top, 'Tag'+str(i))
            child.text = str(tag[i]).encode('utf-8',errors = 'ignore')

            subchild = SubElement(child, 'Content'+str(i))
            subchild.text = str(sub[i]).encode('utf-8',errors = 'ignore')

            i = i+1
    except UnicodeDecodeError:
        print 'oops'
    return top

Edit: I have two lists like this: TAG = ['HAPPY', 'GO', 'LUCKY'] and SUB = ['ED', 'EDD', 'EDDY']

What I want is:

<G_parent>
    <parent1>
        HAPPY
        <child1>
            ED
        </child1>
    </parent1>
    <parent2>
        GO
        <child2>
            EDD
        </child2>
    </parent2>
    <parent3>
        LUCKY
        <child3>
            EDDY
        </child3>
    </parent3>
</G_parent>

The actual lists contain much more than this. I would like to do this with a for loop or something similar.

EDIT:

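Oops. My bad! When I pass the example lists, the code works as expected. In my actual application, though, the lists are long: they contain snippets of text extracted from PDF files. Somewhere in that text I get a UnicodeDecodeError (reason: the text extracted from the PDFs is messy; proof: 'oops' is printed once), and the returned XML object is incomplete. So I need a way for my complete list to be processed even when a UnicodeDecodeError occurs. Is that possible? I am using .decode('utf-8', errors='ignore'), and even then the processing does not finish!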

Answer 1

See this article, especially the section on building XML documents.
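
For reference, here is a minimal sketch of how the tree from the question could be built. It assumes Python 3 (the question's code is Python 2), and the helper name to_text is hypothetical, introduced only for this example. Two things differ from the code in the question: the tag names follow the desired output (an XML tag name cannot contain a space, so "Grand Parent" becomes G_parent), and the text conversion happens per item inside the loop. A try block wrapped around the whole loop ends the loop at the first error, which would explain the incomplete tree; here undecodable bytes are simply dropped, so every pair of items is processed.

```python
from xml.etree.ElementTree import Element, SubElement, Comment, tostring


def to_text(value):
    """Return value as str; bytes that are not valid UTF-8 are dropped.

    Hypothetical helper for this sketch, not part of ElementTree.
    """
    if isinstance(value, bytes):
        return value.decode('utf-8', errors='ignore')
    return str(value)


def make_xml(tags, subs):
    """Build <G_parent> with one <parentN>/<childN> pair per list item."""
    top = Element('G_parent')
    top.append(Comment('This is the ccode parse tree'))
    # zip pairs the two lists; enumerate numbers the pairs starting at 1.
    for i, (tag, sub) in enumerate(zip(tags, subs), start=1):
        parent = SubElement(top, 'parent%d' % i)
        parent.text = to_text(tag)
        child = SubElement(parent, 'child%d' % i)
        child.text = to_text(sub)
    return top


TAG = ['HAPPY', 'GO', 'LUCKY']
SUB = ['ED', 'EDD', 'EDDY']
# encoding='unicode' makes tostring return a str instead of bytes.
print(tostring(make_xml(TAG, SUB), encoding='unicode'))
```

Note that zip stops at the shorter of the two lists, so an item without a partner in the other list is silently left out.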

Creating XML with ElementTree
