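I'm currently using the lxml module to generate XML files from Python.
We need to emit some entity references that will be resolved by our external system. Normally, the text of every element is escaped as XML when it is serialized: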
from lxml import etree
root = etree.Element("root")
sub = etree.Element("sub")
sub.text = "&entity;text"
root.append(sub)
print(etree.tostring(root))
# b'<root><sub>&amp;entity;text</sub></root>'  <- escaped; I want a raw &entity; here
I found that the lxml.etree.Entity class is useful for this:
root = etree.Element("root")
sub = etree.Element("sub")
entity = etree.Entity("entity")
entity.tail = "text"
sub.append(entity)
root.append(sub)
print(etree.tostring(root))
# b'<root><sub>&entity;text</sub></root>'
However, this fails when we try to put an entity reference into an attribute value:
root = etree.Element("root")
sub = etree.Element("sub")
entity = etree.Entity("entity")
entity.tail = "text"
sub.attrib["foo"] = entity
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-52-62cb8ef3a9a6> in <module>()
----> 1 sub.attrib["foo"] = entity
lxml.etree.pyx in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:58775)()
apihelpers.pxi in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:19025)()
apihelpers.pxi in lxml.etree._utf8 (src/lxml/lxml.etree.c:26460)()
TypeError: Argument must be bytes or unicode, got '_Entity'
What I want to get is:
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE foo [
<!ENTITY ent "entity" >
<!ENTITY aaa "aaaaaa" >
]>
<foo>
<sub bar="&ent;bas">&aaa;bbb</sub>
</foo>
How can we write a generator that produces this?
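
One possible workaround, sketched here under assumptions rather than offered as lxml's intended API: since the traceback shows that attribute values must be plain strings, write a placeholder token wherever an entity reference should appear, serialize as usual, and swap the tokens for real references afterwards. The `{{ent:name}}` token format and the helpers `ent()` and `serialize()` are hypothetical names invented for this illustration. The sketch assumes the literal text `{{ent:` never occurs in real content and that entity values contain no `"`, `%`, or `&`. The stdlib fallback import is only there so the snippet runs without lxml installed.

```python
import re

try:
    from lxml import etree  # the library used in the question
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical placeholder format; the name pattern is a simplified XML Name.
PLACEHOLDER = re.compile(r"\{\{ent:([A-Za-z_][\w.-]*)\}\}")


def ent(name):
    """Return a placeholder string that later becomes the reference &name;."""
    return "{{ent:%s}}" % name


def serialize(root, entities):
    """Serialize root as text, with an internal DTD subset declaring entities.

    entities is a list of (name, value) pairs. The result is a str; encode it
    as UTF-8 when writing so that it matches the XML declaration.
    """
    body = etree.tostring(root, encoding="unicode")
    # The placeholders contain no characters that get escaped, so they survive
    # serialization unchanged and can be turned into entity references here.
    body = PLACEHOLDER.sub(r"&\1;", body)
    decls = "".join(f'<!ENTITY {name} "{value}">\n' for name, value in entities)
    return (
        "<?xml version='1.0' encoding='utf-8'?>\n"
        f"<!DOCTYPE {root.tag} [\n{decls}]>\n"
        f"{body}\n"
    )


root = etree.Element("foo")
sub = etree.SubElement(root, "sub")
sub.set("bar", ent("ent") + "bas")  # entity reference inside an attribute
sub.text = ent("aaa") + "bbb"       # entity reference inside element text

xml_text = serialize(root, [("ent", "entity"), ("aaa", "aaaaaa")])
print(xml_text)
```

The same placeholder is used for element text and for attributes so there is only one code path. For element text alone, etree.Entity as shown earlier avoids the string post-processing.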