Parsing and reformatting an element in Python 3

Date: 2014-11-30 16:16:38

Tags: xml python-3.x lxml python-3.4

I have some XML containing elements like this:

<seg id="1" text="some text"/>

In Python 3, I would like to reformat it as:

<in_seg id="sent1"> some text</in_seg>

How can I do this?

1 answer:

Answer 0 (score: 1)

You can parse the original element, then build the new one by instantiating the Element class:

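from lxml.etree import fromstring, Element, tostring

data = """
<seg id="1" text="some text"/>
"""
# Parse the source XML into an element
element = fromstring(data)

# Derive the new tag name, id, and text from the original element
tag_name = 'in_' + element.tag
tag_id = 'sent' + element.attrib['id']
tag_text = element.attrib['text']

# Build the new element; the old "text" attribute becomes the element's text
new_element = Element(tag_name, attrib={'id': tag_id})
new_element.text = tag_text
# encoding='unicode' returns a str; by default tostring() returns bytes in Python 3
print(tostring(new_element, encoding='unicode'))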

Output:

<in_seg id="sent1">some text</in_seg>
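If the input holds several such elements, the same idea can be wrapped in a small function and applied to each one. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree instead of lxml (the calls used here behave the same way in both), and the `<doc>` wrapper, the second `<seg>`, and the name `convert_seg` are assumptions made for the example, not part of the original question.

```python
import xml.etree.ElementTree as ET


def convert_seg(seg):
    """Build an <in_seg> element from a <seg id="..." text="..."/> element.

    convert_seg is a hypothetical helper name used for this example.
    """
    new_element = ET.Element('in_' + seg.tag,
                             attrib={'id': 'sent' + seg.attrib['id']})
    # The old "text" attribute becomes the text content of the new element
    new_element.text = seg.attrib['text']
    return new_element


# Assumed input: a wrapper document holding several <seg> elements
data = """
<doc>
  <seg id="1" text="some text"/>
  <seg id="2" text="more text"/>
</doc>
"""

root = ET.fromstring(data)

# Convert every <seg> and serialize each result as a str (not bytes)
converted = [ET.tostring(convert_seg(seg), encoding='unicode')
             for seg in root.iter('seg')]

for line in converted:
    print(line)
```

Building fresh elements, rather than renaming the existing ones in place, keeps the original tree unchanged and avoids carrying over the whitespace that follows each `<seg>` in the source.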