Converting a Python ElementTree to a string

Question

Whenever I call ElementTree.tostring(e), I get the following error message:

AttributeError: 'Element' object has no attribute 'getroot'

Is there another way to convert an ElementTree object into an XML string?

Traceback:

Traceback (most recent call last):
  File "Development/Python/REObjectSort/REObjectResolver.py", line 145, in <module>
    cm = integrateDataWithCsv(cm, csvm)
  File "Development/Python/REObjectSort/REObjectResolver.py", line 137, in integrateDataWithCsv
    xmlstr = ElementTree.tostring(et.getroot(),encoding='utf8',method='xml')
AttributeError: 'Element' object has no attribute 'getroot'

Answer 1

Element objects have no .getroot() method. Drop that call, and the .tostring() call works:

xmlstr = ElementTree.tostring(et, encoding='utf8', method='xml')
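The error comes from mixing up two types: an ElementTree wraps a whole document and has getroot(), while an Element is a single node and does not. The sketch below illustrates the difference; the helper name to_xml_bytes is hypothetical and not part of the question's code.

```python
import io
from xml.etree import ElementTree

source = b'<Person Name="John" />'

# fromstring() returns an Element, which has no getroot() method.
element = ElementTree.fromstring(source)

# parse() returns an ElementTree wrapper, which does have getroot().
tree = ElementTree.parse(io.BytesIO(source))


def to_xml_bytes(obj):
    """Serialize either an ElementTree or an Element (hypothetical helper)."""
    root = obj.getroot() if isinstance(obj, ElementTree.ElementTree) else obj
    return ElementTree.tostring(root, encoding="utf-8", method="xml")


print(hasattr(element, "getroot"), hasattr(tree, "getroot"))
print(to_xml_bytes(element) == to_xml_bytes(tree))
```

If the variable might hold either type, a type check like this avoids the AttributeError; if it is always an Element, passing it straight to tostring() is enough.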

Answer 2

How do I convert `ElementTree.Element` to a string?

For Python 3:

xml_str = ElementTree.tostring(xml, encoding='unicode')

For Python 2:

xml_str = ElementTree.tostring(xml, encoding='utf-8')

For compatibility with both Python 2 & 3:

xml_str = ElementTree.tostring(xml).decode()

Example usage

from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml).decode()
print(xml_str)

Output:

<Person Name="John" />

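Explanation

Despite what the name implies, ElementTree.tostring() does not return a string by default. Its default behavior is to generate a bytestring. This is not a problem in Python 2, but Python 3 draws a sharper distinction between the two types.

In Python 2 you could use the str type for both text and binary data. Unfortunately this confluence of two different concepts could lead to brittle code which sometimes worked for either kind of data, sometimes not. [...]

To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.

Source: Porting Python 2 Code to Python 3

We can resolve this ambiguity by using decode() to explicitly convert the bytestring to regular text. This ensures compatibility with both Python 2 and Python 3.

For Python 2 & 3 compatibility: ElementTree.tostring(xml).decode()
For Python 3 compatibility: ElementTree.tostring(xml, encoding='unicode')

For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.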

ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />

ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode

ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />

ElementTree.tostring(xml, encoding='utf8')
# Python 3: b'<?xml version=\'1.0\' encoding=\'utf8\'?>\n<Person Name="John" />'
# Python 2: <?xml version='1.0' encoding='utf8'?>
#           <Person Name="John" />
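The Python 3 column of the comparison above can be checked directly by inspecting the return types. This is a minimal sketch for Python 3 only; the variable names are illustrative.

```python
from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="John")

as_bytes = ElementTree.tostring(xml)                     # default encoding: returns bytes
as_text = ElementTree.tostring(xml, encoding="unicode")  # returns str directly
decoded = ElementTree.tostring(xml).decode()             # bytes decoded to str

print(type(as_bytes).__name__, type(as_text).__name__, type(decoded).__name__)
print(as_text == decoded)
```

For plain ASCII content the decoded bytes and the 'unicode' result are the same text; the two differ only when the element contains non-ASCII characters.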

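Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.

Why not use str()?

In most scenarios, using str() would be the "canonical" way to convert an object to a string. Unfortunately, using it on an Element returns the object's location in memory as a hex string, rather than a string representation of the object's data.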

from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="John")
print(str(xml))  # <Element 'Person' at 0x00497A80>

Answer 3

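Non-Latin Answer Extension

This extends @Stevoisiak's answer to cover non-Latin characters. Only one call displays non-Latin characters as-is, and that call differs between Python 3 and Python 2 (marked with arrows in the output).

Input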

xml = ElementTree.fromstring('<Person Name="크리스" />')
xml = ElementTree.Element("Person", Name="크리스")  # Read Note about Python 2

Note: In Python 2, if xml is assigned with ElementTree.Element("Person", Name="크리스"), calling the tostring(...) code raises an error...

UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128)

Output

ElementTree.tostring(xml)
# Python 3 (크리스): b'<Person Name="&#53356;&#47532;&#49828;" />'
# Python 3 (John): b'<Person Name="John" />'

# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />


ElementTree.tostring(xml, encoding='unicode')
# Python 3 (크리스): <Person Name="크리스" />             <-------- Python 3
# Python 3 (John): <Person Name="John" />

# Python 2 (크리스): LookupError: unknown encoding: unicode
# Python 2 (John): LookupError: unknown encoding: unicode

ElementTree.tostring(xml, encoding='utf-8')
# Python 3 (크리스): b'<Person Name="\xed\x81\xac\xeb\xa6\xac\xec\x8a\xa4" />'
# Python 3 (John): b'<Person Name="John" />'

# Python 2 (크리스): <Person Name="크리스" />             <-------- Python 2
# Python 2 (John): <Person Name="John" />

ElementTree.tostring(xml).decode()
# Python 3 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 3 (John): <Person Name="John" />

# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />
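The Python 3 rows above can be reproduced with a short script. This is a sketch for Python 3 only, with illustrative variable names; it also shows that the escaped form loses no information, since parsing it restores the original attribute value.

```python
from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="크리스")

escaped = ElementTree.tostring(xml)                       # ASCII bytes with character references
utf8_bytes = ElementTree.tostring(xml, encoding="utf-8")  # UTF-8 encoded bytes
text = ElementTree.tostring(xml, encoding="unicode")      # str with the characters intact

for label, value in [("default", escaped), ("utf-8", utf8_bytes), ("unicode", text)]:
    print(label, value)

# Parsing the escaped bytes recovers the original characters.
print(ElementTree.fromstring(escaped).get("Name"))
```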

Answer 4

If you only need this for debugging, to see what the XML looks like, then instead of print(xml.etree.ElementTree.tostring(e)) you can use dump like this:

xml.etree.ElementTree.dump(e)

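This works whether e is an ElementTree or an Element, so there should be no need for getroot.

The documentation of dump says:

xml.etree.ElementTree.dump(elem)

Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.

The exact output format is implementation dependent. In this version, it's written as an ordinary XML file.

elem is an element tree or an individual element.

Changed in version 3.8: The dump() function now preserves the attribute order specified by the user.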
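To illustrate that dump() accepts both kinds of object, the sketch below captures its output for an Element and for an ElementTree wrapping it. The helper captured_dump is hypothetical and exists only to make the comparison visible; in normal debugging you would call dump() directly.

```python
import contextlib
import io
from xml.etree import ElementTree

element = ElementTree.Element("Person", Name="John")
tree = ElementTree.ElementTree(element)


def captured_dump(obj):
    """Return what dump() would print (hypothetical helper for illustration)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        ElementTree.dump(obj)
    return buffer.getvalue()


# Both calls produce the same text, so no getroot() is needed.
print(captured_dump(element) == captured_dump(tree))
```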
