Repairing malformed XML with Python

Time: 2013-04-21 06:00:25

Tags: python xml

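I have 50 XML files with mismatched tags, and I want to fix them using Python. The opening tag <names> does not match the closing tag </name>. Can anyone point me in the right direction?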

    <breakfast_menu>
      <food>
        <names>Belgian Waffles</name>
        <price>$5.95</price>
        <calories>650</calories>
      </food>
    </breakfast_menu>

1 answer:

Answer 0: (score: 4)

BeautifulSoup can do this (its 'xml' parser requires lxml to be installed):

>>> from bs4 import BeautifulSoup
>>> myxml = """..."""  # replace ... with the XML posted in the question
>>> soup = BeautifulSoup(myxml, 'xml')
>>> print(soup)
<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
<food>
<names>Belgian Waffles</names>
<price>$5.95</price>
<calories>650</calories>
</food>
</breakfast_menu>
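
To confirm that a file really needs repair, and that the repaired text is well-formed afterwards, a strict parser can serve as a check. The following is a minimal sketch using only the standard library; the helper name `is_well_formed` and the shortened sample strings are illustrative assumptions, not part of the original answer.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for mismatched tags such as <names>...</name>
        return False


# Shortened versions of the XML from the question
broken = "<food><names>Belgian Waffles</name></food>"
repaired = "<food><names>Belgian Waffles</names></food>"

print(is_well_formed(broken))    # False
print(is_well_formed(repaired))  # True
```

Running this check before and after the repair makes it easy to see which of the 50 files were actually affected.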

If you want <name></name> instead:

>>> for i in soup.find_all('names'):
...     i.name = 'name'
...
>>> print(soup)
<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<calories>650</calories>
</food>
</breakfast_menu>
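
Since the question involves 50 files, the repair has to be applied to a whole directory. The sketch below is an alternative to the BeautifulSoup approach, not the answer's method: it uses a targeted regular expression for this one specific mismatch (<names> closed by </name>, with plain text in between) and then checks each result with the standard-library parser. The function names, the directory arguments, and the assumption that the files are UTF-8 encoded are all illustrative.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Matches only the mismatch from the question: <names>text</name>
# (assumes the element contains plain text, no nested tags)
MISMATCH = re.compile(r"<names>([^<]*)</name>")


def fix_name_tags(text, tag="names"):
    """Rewrite every <names>...</name> pair so both tags use `tag`."""
    return MISMATCH.sub(
        lambda m: "<{0}>{1}</{0}>".format(tag, m.group(1)), text
    )


def repair_directory(src_dir, dst_dir, tag="names"):
    """Repair every *.xml file in src_dir and write the result to dst_dir.

    Returns the names of files that still fail to parse after the fix,
    so they can be inspected by hand.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    still_broken = []
    for path in sorted(src.glob("*.xml")):
        fixed = fix_name_tags(path.read_text(encoding="utf-8"), tag)
        try:
            ET.fromstring(fixed.encode("utf-8"))
        except ET.ParseError:
            still_broken.append(path.name)
        (dst / path.name).write_text(fixed, encoding="utf-8")
    return still_broken
```

Calling `repair_directory("broken", "fixed")` would keep the <names> tag, while `repair_directory("broken", "fixed", tag="name")` would rename it to <name>. Writing to a separate output directory leaves the original files untouched in case the pattern does not cover every kind of damage.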