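Writing an entity reference into an attribute value

Time: 2016-11-16 15:34:03

Tags: python xml lxml entityreference

I am currently generating XML files from Python with the lxml module.

We need to emit some entity references that our external system will resolve. Normally, the text of every element is escaped as XML on output: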

from lxml import etree

root = etree.Element("root")
sub = etree.Element("sub")
sub.text = "&entity;text"  # the ampersand is escaped on output
root.append(sub)
print(etree.tostring(root))
# b'<root><sub>&amp;entity;text</sub></root>'  <- I want this without the escaping

I found that lxml.etree.Entity is useful for this:

root = etree.Element("root")
sub = etree.Element("sub")
entity = etree.Entity("entity")  # an entity reference node, not a string
entity.tail = "text"             # text that follows the reference
sub.append(entity)
root.append(sub)
print(etree.tostring(root))
# b'<root><sub>&entity;text</sub></root>'
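
To see why this works for element content, it may help to inspect the entity node directly. The following is a minimal sketch, assuming a recent lxml on Python 3: the entity is stored as a child node of sub, so the serializer writes it as a reference instead of escaping it as text.

```python
from lxml import etree

root = etree.Element("root")
sub = etree.SubElement(root, "sub")

# Entity() creates a node, not a string; it is appended like any other child.
entity = etree.Entity("entity")
entity.tail = "text"  # text that follows the reference
sub.append(entity)

print(len(sub))       # the entity counts as a child node of <sub>
print(entity.text)    # textual form of the reference
serialized = etree.tostring(root, encoding="unicode")
print(serialized)
```

Because the reference lives in the tree as a node, this approach only applies where child nodes are allowed, that is, in element content.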

However, if we try to set an attribute value to text that contains an entity reference, it fails:

root = etree.Element("root")
sub = etree.Element("sub")
entity = etree.Entity("entity")
entity.tail = "text"
sub.attrib["foo"] = entity

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-52-62cb8ef3a9a6> in <module>()
----> 1 sub.attrib["foo"] = entity

lxml.etree.pyx in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:58775)()

apihelpers.pxi in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:19025)()

apihelpers.pxi in lxml.etree._utf8 (src/lxml/lxml.etree.c:26460)()

TypeError: Argument must be bytes or unicode, got '_Entity'
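
The traceback indicates that attribute values must be bytes or unicode strings, so an Entity node cannot be stored there. One possible workaround, sketched below, is to put a marker in the plain string and rewrite it after serialization. This is not an lxml feature: the "{{name}}" marker syntax and the helper name tostring_with_entities are assumptions made up for this illustration, and the sketch assumes real data never contains text of that shape.

```python
import re
from lxml import etree

# Hypothetical marker syntax: "{{name}}" stands for the reference "&name;".
# Assumes real data never contains text of this shape.
MARKER = re.compile(r"\{\{([A-Za-z_][\w.-]*)\}\}")

def tostring_with_entities(element):
    """Serialize normally, then turn every {{name}} marker into &name;."""
    xml = etree.tostring(element, encoding="unicode")
    return MARKER.sub(r"&\1;", xml)

root = etree.Element("foo")
sub = etree.SubElement(root, "sub")
sub.set("bar", "{{ent}}bas")  # a plain string, so lxml accepts it
sub.text = "{{aaa}}bbb"       # the same marker also works in element text

result = tostring_with_entities(root)
print(result)
# <foo><sub bar="&ent;bas">&aaa;bbb</sub></foo>
```

The result is only well-formed XML if the referenced entities are declared in a DTD that accompanies the document, and the substitution is purely textual, so it would also rewrite a marker that appeared anywhere else in the serialized output.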

What I want to get is:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE foo [
  <!ENTITY ent "entity" >
  <!ENTITY aaa "aaaaaa" >
]>
<foo>
  <sub bar="&ent;bas">&aaa;bbb</sub>
</foo>
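
The DOCTYPE part of this target can be assembled as plain text. The sketch below is one way to do it under stated assumptions: the helper name wrap_with_internal_subset is hypothetical, the body is taken as an already serialized string, and entity values are assumed to be plain text without '"', '%' or '&'. Parsing the result back is only a sanity check that the references resolve against the internal subset.

```python
from lxml import etree

def wrap_with_internal_subset(body_xml, root_name, entities):
    """Prepend an XML declaration and a DOCTYPE declaring the given entities.

    Assumes entity values are plain text without '"', '%' or '&'.
    """
    decls = "\n".join(
        '  <!ENTITY %s "%s" >' % (name, value)
        for name, value in entities.items()
    )
    return (
        "<?xml version='1.0' encoding='utf-8'?>\n"
        "<!DOCTYPE %s [\n%s\n]>\n%s" % (root_name, decls, body_xml)
    )

# Serialized body that already contains the entity references.
body = '<foo>\n  <sub bar="&ent;bas">&aaa;bbb</sub>\n</foo>'
document = wrap_with_internal_subset(
    body, "foo", {"ent": "entity", "aaa": "aaaaaa"}
)
print(document)

# Parse from bytes (a str with an encoding declaration is rejected by lxml)
# and look at the values after entity resolution.
parsed = etree.fromstring(document.encode("utf-8"))
parsed_sub = parsed.find("sub")
print(parsed_sub.get("bar"), parsed_sub.text)
```

With lxml's default parser settings, the attribute and text read back should contain the expanded entity values rather than the references.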

How can we define a generator that produces this?

0 answers:

No answers