I am generating an XML document in Python using ElementTree, but the tostring function does not include an XML declaration when converting to plain text.
from xml.etree.ElementTree import Element, SubElement, tostring
document = Element('outer')
node = SubElement(document, 'inner')
node.NewValue = 1
print tostring(document) # Outputs "<outer><inner /></outer>"
I need my string to include the following XML declaration:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
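However, there does not seem to be any documented way of doing this.
Is there a proper method for rendering the XML declaration in an ElementTree?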
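Answer 0 (score: 73)
I am surprised to find that there doesn't seem to be a way to do this with ElementTree.tostring(). You can, however, use ElementTree.ElementTree.write() to write your XML document to a fake file: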
from io import BytesIO
from xml.etree import ElementTree as ET
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)
f = BytesIO()
et.write(f, encoding='utf-8', xml_declaration=True)
print(f.getvalue()) # your XML file, encoded as UTF-8
See this question. Even then, I don't think you can get your "standalone" attribute without prepending it yourself.
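If you need the result as a string rather than as a file, the fake-file approach above can be wrapped in a small helper. This is only a sketch: `to_xml_string` is a hypothetical name, not part of ElementTree, and it assumes Python 3.

```python
from io import BytesIO
from xml.etree import ElementTree as ET


def to_xml_string(element, encoding="utf-8"):
    """Serialize `element` with an XML declaration and return it as text.

    Hypothetical helper for illustration; not an ElementTree function.
    """
    buffer = BytesIO()  # in-memory stand-in for a real file
    ET.ElementTree(element).write(buffer, encoding=encoding, xml_declaration=True)
    # write() produced bytes in `encoding`; decode them back to str.
    return buffer.getvalue().decode(encoding)


document = ET.Element("outer")
ET.SubElement(document, "inner")
xml_text = to_xml_string(document)
print(xml_text)
```

The declaration produced this way carries only the version and the encoding, so it does not cover the standalone attribute from the question.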
Answer 1 (score: 20)
I would use lxml (see http://lxml.de/api.html).
Then you can:
from lxml import etree
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, xml_declaration=True))
Answer 2 (score: 13)
If you include encoding='utf8', you will get an XML header:
xml.etree.ElementTree.tostring writes an XML encoding declaration with encoding='utf8'
Sample Python code (works with Python 2 and 3):
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(
ElementTree.fromstring('<xml><test>123</test></xml>')
)
root = tree.getroot()
print('without:')
print(ElementTree.tostring(root, method='xml'))
print('')
print('with:')
print(ElementTree.tostring(root, encoding='utf8', method='xml'))
Python 2 output:
$ python2 example.py
without:
<xml><test>123</test></xml>
with:
<?xml version='1.0' encoding='utf8'?>
<xml><test>123</test></xml>
With Python 3 you will note the b prefix indicating that byte literals are returned (just like with Python 2):
$ python3 example.py
without:
b'<xml><test>123</test></xml>'
with:
b"<?xml version='1.0' encoding='utf8'?>\n<xml><test>123</test></xml>"
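Answer 3 (score: 3)
I encountered this issue recently, and after some digging in the code I found that the following snippet is the definition of the function ElementTree.write: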
def write(self, file, encoding="us-ascii"):
    assert self._root is not None
    if not hasattr(file, "write"):
        file = open(file, "wb")
    if not encoding:
        encoding = "us-ascii"
    elif encoding != "utf-8" and encoding != "us-ascii":
        file.write("<?xml version='1.0' encoding='%s'?>\n" %
                   encoding)
    self._write(file, self._root, encoding, {})
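So the answer is: if you need to write the XML header to your file, set the encoding argument to something other than utf-8 or us-ascii, e.g. UTF-8. The check in this version of write is a case-sensitive string comparison, so the uppercase spelling matches neither excluded value and the header is written.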
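Since the snippet above comes from one particular version of the library, it may be worth checking how your own interpreter treats each spelling. The following is a small sketch for Python 3; `has_declaration` is a hypothetical helper name, and the result for the uppercase spelling may differ from the old source quoted above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<xml><test>123</test></xml>")


def has_declaration(encoding):
    """Return True if tostring() emits an XML declaration for this encoding name."""
    data = ET.tostring(root, encoding=encoding)
    return data.startswith(b"<?xml")


# Try several spellings; xml_declaration is left at its default.
results = {
    name: has_declaration(name)
    for name in ("us-ascii", "utf-8", "UTF-8", "utf8")
}
for name, flag in results.items():
    print(name, flag)
```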
Answer 4 (score: 2)
A minimal working example using the ElementTree package:
import xml.etree.ElementTree as ET
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
node.text = '1'
res = ET.tostring(document, encoding='utf8', method='xml').decode()
print(res)
The output is:
<?xml version='1.0' encoding='utf8'?>
<outer><inner>1</inner></outer>
Answer 5 (score: 1)
Simple example for both Python 2 and 3 (the encoding parameter must be utf8):
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='utf8', method='xml'))
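As of Python 3.8 there is an xml_declaration parameter for this:
New in version 3.8: The xml_declaration and default_namespace parameters.
xml.etree.ElementTree.tostring(element, encoding="us-ascii", method="xml", *, xml_declaration=None, default_namespace=None, short_empty_elements=True) Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml"). xml_declaration, default_namespace and short_empty_elements have the same meaning as in ElementTree.write(). Returns an (optionally) encoded string containing the XML data.
Example for Python 3.8 and later: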
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='unicode', method='xml', xml_declaration=True))
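If you want the declaration to name a specific encoding, one variation is to serialize to bytes with that encoding and decode afterwards. This is a sketch that assumes Python 3.8 or later, since it relies on the xml_declaration parameter described above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<xml><test>123</test></xml>")

# Serialize to bytes so the declaration names the encoding we asked for.
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# Decode with the same encoding to get a str for printing or further processing.
xml_text = xml_bytes.decode("utf-8")
print(xml_text)
```</xml>"
</test>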
Answer 6 (score: 0)
I would use ET:
try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, encoding='UTF-8', xml_declaration=True))
Answer 7 (score: 0)
This works if you only want to print. I got an error when I tried to send it to a file...
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element, SubElement, Comment, tostring
def prettify(elem):
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent=" ")
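The answer does not say which error occurred when writing to a file. One possible cause is mixing str and bytes, so here is a sketch that sidesteps it by asking minidom for bytes directly. It assumes Python 3, and `prettify_bytes` is a hypothetical name used only for illustration.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET


def prettify_bytes(elem, encoding="UTF-8"):
    """Return pretty-printed XML as bytes, with a declaration naming `encoding`."""
    rough = ET.tostring(elem, encoding="utf-8")
    reparsed = minidom.parseString(rough)
    # Passing `encoding` makes toprettyxml() return bytes instead of str.
    return reparsed.toprettyxml(indent="  ", encoding=encoding)


document = ET.Element("outer")
ET.SubElement(document, "inner").text = "1"
pretty = prettify_bytes(document)
print(pretty.decode("utf-8"))
```

Because the result is bytes, the target file would need to be opened in binary mode, for example with open(path, "wb").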
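Answer 8 (score: 0)
I did not find any alternative in the documentation for adding the standalone argument, so I adapted the ET.tostring function to take the declaration as a parameter.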
from xml.etree import ElementTree as ET
# Sample
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)
# Function that you need
def tostring(element, declaration, encoding=None, method=None,):
    class dummy:
        pass
    data = []
    data.append(declaration+"\n")
    file = dummy()
    file.write = data.append
    ET.ElementTree(element).write(file, encoding, method=method)
    return "".join(data)
# Working example
xdec = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>"""
xml = tostring(document, encoding='utf-8', declaration=xdec)
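On Python 3 the same idea can be sketched more directly by serializing to a str and prepending a hand-written declaration. This is an illustration only: `tostring_with_declaration` is a hypothetical name, and the declaration text is supplied by the caller rather than generated by ElementTree.

```python
from xml.etree import ElementTree as ET


def tostring_with_declaration(element, declaration):
    """Return `declaration` on its own line followed by the serialized element."""
    # encoding="unicode" yields a str and adds no declaration of its own.
    body = ET.tostring(element, encoding="unicode")
    return declaration + "\n" + body


document = ET.Element("outer")
ET.SubElement(document, "inner")
xdec = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
xml = tostring_with_declaration(document, xdec)
print(xml)
```

Since the declaration is just text here, it is up to the caller to encode the result as UTF-8 when writing it out, so that the bytes match what the declaration claims.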
Answer 9 (score: 0)
Another pretty simple option is to concatenate the desired header to the XML string, like this:
xml = (bytes('<?xml version="1.0" encoding="UTF-8"?>\n', encoding='utf-8') + ET.tostring(root))
xml = xml.decode('utf-8')
with open('invoice.xml', 'w+') as f:
    f.write(xml)
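Answer 10 (score: 0)
Is there a proper method for rendering the XML declaration in an ElementTree?
Yes, and there is no need to use the .tostring function. According to the documentation, you should create an ElementTree object, create the Element and SubElements, set the tree's root, and finally pass the xml_declaration argument to the .write function, so that the declaration line is included in the output file.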
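The steps described in this answer could look roughly like the sketch below. It is not the answerer's original code: the element names and the output.xml file name are made up for illustration, and a temporary directory is used so that nothing is written into the working directory.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Build the elements, then wrap the root in an ElementTree object.
root = ET.Element("outer")
ET.SubElement(root, "inner").text = "1"
tree = ET.ElementTree(root)

# Write to a file, asking write() to emit the declaration line.
path = os.path.join(tempfile.mkdtemp(), "output.xml")
tree.write(path, encoding="utf-8", xml_declaration=True)

# Read the file back to show what was written.
with open(path, encoding="utf-8") as f:
    content = f.read()
print(content)
```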