I have an XML file that looks like this:
<Main>
<Stuff author="Jojo" name="Thing 1">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description" />
<Attr name="version" value="1.0.0" />
<Attr name="software" value="Misrocoft Ociffe" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Toto" name="Thing 2">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
</Main>
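I update it and then write it back out, but the problem is that if I rewrite it with toprettyxml(), I get new whitespace between the old lines, like this: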
<Main>
<Stuff author="Jojo" name="Thing 1">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description" />
<Attr name="version" value="1.0.0" />
<Attr name="software" value="Misrocoft Ociffe" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Toto" name="Thing 2">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Titi" name="New thing">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
</Main>
If I rewrite it with toxml() instead, the new element gets no indentation or line breaks at all:
<Main>
<Stuff author="Jojo" name="Thing 1">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description" />
<Attr name="version" value="1.0.0" />
<Attr name="software" value="Misrocoft Ociffe" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Toto" name="Thing 2">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff></Main>
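Is there a way to output a new, pretty-printed XML that does not alter the existing formatting of the file?
I was thinking of collapsing the XML into a single-line string and then rewriting it with toprettyxml(), but I don't know how to do that or whether it is even possible (I use etree and minidom to read the information).
Here is the code I ended up with; note that my rootXml comes from ElementTree.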
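from xml.dom import minidom
import xml.etree.ElementTree as ET

def writeXml(rootXml, xmlFile):
    # Serialize the ElementTree root to text (encoding="unicode" returns str, not bytes).
    roughString = ET.tostring(rootXml, encoding="unicode")
    # Collapse everything onto one line so the old indentation is discarded.
    oneLineString = "".join(s.strip() for s in roughString.splitlines())
    minidomXml = minidom.parseString(oneLineString)
    # Pretty-print from the root element, which leaves out the XML declaration.
    rootMinidom = minidomXml.documentElement
    prettyXmlString = rootMinidom.toprettyxml()
    # Round-trip through ElementTree, as in the rest of my code, then write the result.
    prettyXml = ET.fromstring(prettyXmlString)
    with open(xmlFile, "w", encoding="utf-8") as f:
        f.write(ET.tostring(prettyXml, encoding="unicode"))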
This produces the following XML:
<Main>
<Stuff author="Jojo" name="Thing 1">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description" />
<Attr name="version" value="1.0.0" />
<Attr name="software" value="Misrocoft Ociffe" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Toto" name="Thing 2">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
<Stuff author="Titi" name="New thing">
<Attr name="annotation" value="Short description" />
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9" />
<Attr name="software" value="Tophoshop" />
<Attr name="language" value="Python" />
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
<Attr name="command" value="doSomething()" />
</Stuff>
</Main>
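A possible variation on the line-stripping step above: joining stripped lines can glue tokens together if a tag happens to be split across lines, so the same effect can be sketched on the DOM itself by removing whitespace-only text nodes before calling toprettyxml(). The helper names strip_whitespace_nodes and pretty_from_string are hypothetical, chosen only for this illustration, and the sketch assumes Python 3.

```python
from xml.dom import minidom


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.hasChildNodes():
            strip_whitespace_nodes(child)


def pretty_from_string(xml_text, indent="  "):
    """Parse xml_text, drop the old indentation, and pretty-print it again."""
    dom = minidom.parseString(xml_text)
    strip_whitespace_nodes(dom)
    # Printing from the root element leaves out the XML declaration.
    return dom.documentElement.toprettyxml(indent=indent)


if __name__ == "__main__":
    mixed = (
        '<Main>\n'
        '  <Stuff author="Jojo" name="Thing 1">\n'
        '    <Attr name="version" value="1.0.0" />\n'
        '  </Stuff>\n'
        '<Stuff author="Titi" name="New thing"><Attr name="version" value="4.3.9" /></Stuff></Main>'
    )
    print(pretty_from_string(mixed))
```

Because the old indentation is removed as nodes rather than as characters, text that carries real content is left untouched; only nodes made up entirely of whitespace are dropped.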
Answer 0 (score: 1)
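As far as I know, there is no clean way to fix minidom's toprettyxml()*. One of the simplest approaches may be to use BeautifulSoup's prettify(). For example, your single-line Stuff element is correctly split by prettify() onto new, indented lines: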
>>> from bs4 import BeautifulSoup
>>> raw = '''<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff>'''
>>> soup = BeautifulSoup(raw, "xml")
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<Stuff author="Titi" name="New thing">
<Attr name="annotation" value="Short description"/>
<Attr name="description" value="Long description"/>
<Attr name="version" value="4.3.9"/>
<Attr name="software" value="Tophoshop"/>
<Attr name="language" value="Python"/>
<Attr name="path" value="/here/there/aroundHere/somewhere/file.ext"/>
<Attr name="command" value="doSomething()"/>
</Stuff>
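If adding a dependency is not an option, a standard-library alternative may be worth trying. This is a minimal sketch, assuming Python 3.9 or later, where xml.etree.ElementTree.indent() is available; the function name reindent is hypothetical. ET.indent() overwrites whitespace-only text between elements, so old and new elements should end up with the same indentation.

```python
import xml.etree.ElementTree as ET


def reindent(xml_text, space="  "):
    """Return xml_text re-serialized with uniform indentation."""
    root = ET.fromstring(xml_text)
    # Replaces whitespace-only text and tails in place (Python 3.9+).
    ET.indent(root, space=space)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    mixed = (
        '<Main>\n'
        '    <Stuff author="Jojo" name="Thing 1">\n'
        '        <Attr name="version" value="1.0.0" />\n'
        '    </Stuff>\n'
        '<Stuff author="Titi" name="New thing"><Attr name="version" value="4.3.9" /></Stuff></Main>'
    )
    print(reindent(mixed))
```

Note that this normalizes the whole tree to one indentation style rather than preserving the original bytes of the untouched lines, so it fits best when the file's existing indentation is already regular.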
*) References: