XML parsing error with cElementTree in Python

Date: 2015-01-23 22:17:10

Tags: python xml parsing xml-parsing

I have been writing XML files with cElementTree, and when I call .parse(file) on one of them I get this error:

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 15

The XML file:

<material Date Created="1/23/2015 at 14:59:10 in Mountain Standard Time" Material Name="Material" Render Engine="CYCLES">
    <main>
        <node0 inputs="" label="" location="<Vector (-114.1876, 479.6438)>" name="Texture Coordinate" node_specific="['from_dupli', False]" outputs="" type="TEX_COORD" />
        <node0 inputs="" label="" location="<Vector (87.1538, 383.3991)>" name="Attribute" node_specific="['attribute_name', '']" outputs="" type="ATTRIBUTE" />
        <node0 inputs="" label="" location="<Vector (-38.2097, 246.6303)>" name="RGB" node_specific="" outputs="[0, (0.5, 0.5, 0.5, 1.0)]" type="RGB" />
    </main>
</material>

I don't understand why it cannot parse a file that it created itself.

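1 answer:

Answer 0 (score: 2)

You are trying to parse a document that is not well-formed XML. Attribute names cannot contain spaces; after Date the parser expects an =, not more of an attribute name: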

<material Date Created="1/23/2015 at 14:59:10 in Mountain Standard Time"
<!--          ^ position 15 on line 1 -->
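The failure can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the shortened `bad` string is an assumption made for illustration, and it uses `xml.etree.ElementTree`, which has picked up the C accelerator automatically since Python 3.3, so a separate cElementTree import is not needed.

```python
from xml.etree import ElementTree

# Minimal reproduction: an attribute name that contains a space.
bad = '<material Date Created="1/23/2015" />'

error = None
try:
    ElementTree.fromstring(bad)
except ElementTree.ParseError as exc:
    # ParseError carries the location of the offending token.
    error = exc

if error is not None:
    print(error)           # the parser's message
    print(error.position)  # (line, column) tuple reported by the parser
```

The `position` attribute is handy when the file is large, since it points at the first token the parser could not accept rather than at the element as a whole.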

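The < and > characters in the location attribute values should also be escaped, as &lt; and &gt; respectively. Strictly speaking only < is forbidden inside an attribute value, but escaping both is the usual convention.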
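As a rough illustration of that escaping, the sketch below shows two ways to get there: escaping by hand with `xml.sax.saxutils.escape`, and letting ElementTree do it when the value is set through its API. The `raw` value is copied from the question; the variable names are otherwise made up for the example.

```python
from xml.etree import ElementTree
from xml.sax.saxutils import escape

raw = "<Vector (-114.1876, 479.6438)>"

# Manual escaping, for XML text that is assembled by hand.
escaped = escape(raw)
print(escaped)

# Setting the value through the ElementTree API: the serializer
# escapes the angle brackets itself.
node = ElementTree.Element("node0", {"location": raw})
serialized = ElementTree.tostring(node, encoding="unicode")
print(serialized)

# Parsing the serialized form gives back the original value.
round_tripped = ElementTree.fromstring(serialized).get("location")
print(round_tripped == raw)
```

Note that `escape` leaves quote characters alone by default; for hand-built attribute values, `xml.sax.saxutils.quoteattr` also takes care of the surrounding quotes.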

If you replace the spaces in the attribute names on the material tag and escape those angle brackets, the document parses:

>>> from xml.etree import ElementTree
>>> sample = '''\
... <material Date_Created="1/23/2015 at 14:59:10 in Mountain Standard Time" Material_Name="Material" Render_Engine="CYCLES">
...     <main>
...         <node0 inputs="" label="" location="&lt;Vector (-114.1876, 479.6438)&gt;" name="Texture Coordinate" node_specific="['from_dupli', False]" outputs="" type="TEX_COORD" />
...         <node0 inputs="" label="" location="&lt;Vector (87.1538, 383.3991)&gt;" name="Attribute" node_specific="['attribute_name', '']" outputs="" type="ATTRIBUTE" />
...         <node0 inputs="" label="" location="&lt;Vector (-38.2097, 246.6303)&gt;" name="RGB" node_specific="" outputs="[0, (0.5, 0.5, 0.5, 1.0)]" type="RGB" />
...     </main>
... </material>
... '''
>>> tree = ElementTree.fromstring(sample)
>>> tree
<Element 'material' at 0x1042d42d0>
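On the writing side, the likely reason the file could be created at all is that ElementTree does not check attribute names when it serializes an element, so a name such as "Date Created" is written out as-is. One possible fix is to clean the names before building the element. This is only a sketch: `safe_attr_name` is a hypothetical helper, not part of ElementTree, and replacing whitespace with underscores is one naming choice among several.

```python
import re
from xml.etree import ElementTree


def safe_attr_name(name):
    """Hypothetical helper: replace runs of whitespace with underscores."""
    return re.sub(r"\s+", "_", name.strip())


attrs = {
    "Date Created": "1/23/2015 at 14:59:10 in Mountain Standard Time",
    "Material Name": "Material",
    "Render Engine": "CYCLES",
}

# Clean the names up front, because the serializer will not reject them.
root = ElementTree.Element(
    "material", {safe_attr_name(k): v for k, v in attrs.items()}
)
xml_text = ElementTree.tostring(root, encoding="unicode")

# The cleaned document parses, and attributes are read back with .get().
parsed = ElementTree.fromstring(xml_text)
print(parsed.get("Render_Engine"))
```

A stricter version could also reject names that start with a digit or contain other characters that XML does not allow in names; the helper above only handles the whitespace case from the question.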