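I can add XML nodes with ElementTree, but when I open the XML file as text, the new output is on a single line instead of being laid out as a tree. I also tried minidom.toprettyxml, but I don't know how to apply its output to the original XML. Because I want the script to be reproducible in other environments, I would rather not use external libraries such as lxml. Can someone help me pretty-print the output? (Python 2.7)
Sample XML. This is how it looks both as text and in Explorer.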
<?xml version="1.0" encoding="utf-8"?>
<default_locators >
    <locator_ref>
        <name>cherry</name>
        <display_name>cherrycherry</display_name>
        <workspace_properties>
            <factory_progid>Workspace</factory_progid>
            <path>InstallDir</path>
        </workspace_properties>
    </locator_ref>
</default_locators>
Expected output, both as text and in Explorer.
<?xml version="1.0" encoding="utf-8"?>
<default_locators >
    <locator_ref>
        <name>cherry</name>
        <display_name>cherrycherry</display_name>
        <workspace_properties>
            <factory_progid>Workspace</factory_progid>
            <path>InstallDir</path>
        </workspace_properties>
    </locator_ref>
    <locator_ref>
        <name>berry</name>
        <display_name>berryberry</display_name>
        <workspace_properties>
            <factory_progid>Workspace</factory_progid>
            <path>C:\temp\temp</path>
        </workspace_properties>
    </locator_ref>
</default_locators>
My script
#coding: cp932
import xml.etree.ElementTree as ET
tree = ET.parse(r"C:\DefaultLocators.xml")
root = tree.getroot()
locator_ref = ET.SubElement(root, "locator_ref")
name = ET.SubElement(locator_ref, "name")
name.text = " berry"
display_name = ET.SubElement(locator_ref, "display_name")
display_name.text = "berryberry"
workspace_properties = ET.SubElement(locator_ref, "workspace_properties")
factory_progid = ET.SubElement(workspace_properties,"factory_progid")
factory_progid.text = "Workspace"
path = ET.SubElement(workspace_properties, "path")
path.text = r"c:\temp\temp"
tree.write(r"C:\DefaultLocators.xml", encoding='utf-8')
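Actual output. After running my script, the new nodes are added to my sample.xml file, but they are written on a single line, without the newlines and indentation used in the original sample.xml. At least that is how it looks when I open sample.xml as text. When I open sample.xml in Explorer it looks fine, and I still see the newlines and indentation as before. How can I keep the original tree layout in the text file after running the script?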
<default_locators>
    <locator_ref>
        <name>cherry</name>
        <display_name>cherrycherry</display_name>
        <workspace_properties>
            <factory_progid>Workspace</factory_progid>
            <path>InstallDir</path>
        </workspace_properties>
    </locator_ref>
<locator_ref><name> berry</name><display_name>berryberry</display_name><workspace_properties><factory_progid>Workspace</factory_progid><path>c:\temp\temp</path></workspace_properties></locator_ref></default_locators>
Answer 0 (score: 1)
While processing each element, you can set: element.tail = '\n'
Each element is then followed by a line break when the tree is written.
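The tail idea can be extended from plain line breaks to full indentation. The following is a minimal sketch, not code from the answer: `indent`, its `step` parameter, and the shortened sample document are illustrative assumptions. It rewrites the whitespace-only `.text` and `.tail` values in place, so both the existing and the newly added elements serialize as a tree, using only the standard library.

```python
import xml.etree.ElementTree as ET


def indent(elem, level=0, step="  "):
    """Set whitespace-only .text/.tail values so the tree serializes indented."""
    pad = "\n" + level * step
    if len(elem):
        # Whitespace between the opening tag and the first child.
        if not elem.text or not elem.text.strip():
            elem.text = pad + step
        for child in elem:
            indent(child, level + 1, step)
        # `child` is now the last child; its tail sits before the parent's closing tag.
        if not child.tail or not child.tail.strip():
            child.tail = pad
    # Every non-root element is followed by a newline plus its own indentation.
    if level and (not elem.tail or not elem.tail.strip()):
        elem.tail = pad


# Shortened, hypothetical stand-in for the file from the question.
SAMPLE = (
    "<default_locators>"
    "<locator_ref><name>cherry</name></locator_ref>"
    "</default_locators>"
)

root = ET.fromstring(SAMPLE)
new_ref = ET.SubElement(root, "locator_ref")
ET.SubElement(new_ref, "name").text = "berry"

indent(root)
result = ET.tostring(root).decode("ascii")
print(result)
```

With a parsed file, the same call would go right before `tree.write(...)`, for example `indent(tree.getroot())`. Python 3.9 and later ship a built-in `xml.etree.ElementTree.indent` that does this job, but it is not available in Python 2.7.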
Answer 1 (score: 0)
Write your XML with ElementTree like this:
import xml.etree.ElementTree as ET
def serialize_xml(write, elem, encoding, qnames, namespaces):
    # Python 2.7 only: a copy of the private ET._serialize_xml that starts
    # every opening tag on a new line. The helpers used here (ET._encode,
    # ET._escape_cdata, ET._escape_attrib) are private and may change.
    tag = elem.tag
    text = elem.text
    if tag is ET.Comment:
        write("<!--%s-->" % ET._encode(text, encoding))
    elif tag is ET.ProcessingInstruction:
        write("<?%s?>" % ET._encode(text, encoding))
    else:
        tag = qnames[tag]
        if tag is None:
            if text:
                write(ET._escape_cdata(text, encoding))
            for e in elem:
                serialize_xml(write, e, encoding, qnames, None)
        else:
            write("\n<" + tag)  ## '\n' added by namit
            items = elem.items()
            if items or namespaces:
                if namespaces:
                    for v, k in sorted(namespaces.items(),
                                       key=lambda x: x[1]):  # sort on prefix
                        if k:
                            k = ":" + k
                        write(" xmlns%s=\"%s\"" % (
                            k.encode(encoding),
                            ET._escape_attrib(v, encoding)
                        ))
                for k, v in sorted(items):  # lexical order
                    if isinstance(k, ET.QName):
                        k = k.text
                    if isinstance(v, ET.QName):
                        v = qnames[v.text]
                    else:
                        v = ET._escape_attrib(v, encoding)
                    write(" %s=\"%s\"" % (qnames[k], v))
            if text or len(elem):
                write(">")
                if text:
                    write(ET._escape_cdata(text, encoding))
                for e in elem:
                    serialize_xml(write, e, encoding, qnames, None)
                write("</" + tag + ">")
            else:
                write(" />")
    if elem.tail:
        write(ET._escape_cdata(elem.tail, encoding))
ET._serialize_xml=serialize_xml
tree = ET.parse(r"samplexml.xml")
root = tree.getroot()
locator_ref = ET.SubElement(root, "locator_ref")
name = ET.SubElement(locator_ref, "name")
name.text = " berry"
display_name = ET.SubElement(locator_ref, "display_name")
display_name.text = "berryberry"
workspace_properties = ET.SubElement(locator_ref, "workspace_properties")
factory_progid = ET.SubElement(workspace_properties,"factory_progid")
factory_progid.text = "WorkspaceFactory"
path = ET.SubElement(workspace_properties, "path")
ins_out=open("samplexml_1.xml",'wb',1000)
ET.ElementTree(locator_ref).write(ins_out,encoding="ASCII")
ins_out.close()
This starts every opening tag on a new line when the file is written, without adding whitespace to any element's tail.
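For comparison, here is a sketch that stays within the standard library without patching private ElementTree functions, using the minidom.toprettyxml approach mentioned in the question. The helper names `strip_whitespace_nodes` and `pretty_xml` and the shortened sample document are assumptions made for illustration. The idea is to serialize the modified tree, re-parse it with minidom, drop the whitespace-only text nodes (otherwise toprettyxml stacks its own indentation on top of the existing one), and pretty-print the result.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def strip_whitespace_nodes(node):
    """Recursively remove whitespace-only text nodes from a minidom tree."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


def pretty_xml(root, indent="  "):
    """Return an indented string for an ElementTree element."""
    dom = minidom.parseString(ET.tostring(root))
    strip_whitespace_nodes(dom)
    return dom.toprettyxml(indent=indent)


# Shortened, hypothetical stand-in for the file from the question,
# including the indentation that already exists in the original file.
SAMPLE = """<default_locators>
    <locator_ref>
        <name>cherry</name>
    </locator_ref>
</default_locators>"""

root = ET.fromstring(SAMPLE)
new_ref = ET.SubElement(root, "locator_ref")
ET.SubElement(new_ref, "name").text = "berry"

pretty = pretty_xml(root)
print(pretty)
```

To save the result, the string can be written to the target file; passing `encoding="utf-8"` to `toprettyxml` returns encoded bytes with a matching XML declaration instead. Note that early Python 2.7 releases put element text on separate lines in `toprettyxml` output, so the exact layout depends on the interpreter version.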
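Answer 2 (score: 0)
I think you should try the lxml library. In my view it is the best way to parse XML in Python. For this kind of task it has the handy *pretty_print* argument. Here is an example: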
import lxml.etree as etree
root = etree.Element("root")
for rn in range(10):
    etree.SubElement(root, "column_%s" % str(rn)).text = str(rn*rn)
pretty_data = etree.tostring(root, pretty_print=True, encoding='utf-8')
print(pretty_data)