如何使用python解析包含'&,<>,〜'等符号的xml文件

时间:2016-07-20 04:49:40

标签: python xml

我在阅读以下xml文件时获得TypeError: Type 'NoneType' cannot be serialized.

<root>
 <sub_component> 
     Hii & heloo <> 
 </sub_component>
</root>

我用于编码的代码如下

from lxml import etree
parser = etree.XMLParser(recover=True) # recover from bad characters.
root = etree.fromstring(file_path, parser=parser)
print etree.tostring(root)

1 个答案:

答案 0 :(得分:0)

您可以使用BeautifulSoup

from bs4 import BeautifulSoup

soup= BeautifulSoup(xml_string, 'html.parser')