I have a basic xmltodict example that closely follows the one given on the project's GitHub page.
import xmltodict

def handle(_, book):
    # Called once per <book> element (depth 2); return True to keep parsing.
    print(book['title'])
    return True

with open(r'C:\Users\u369811\books.xml', 'r') as f:
    FILE = f.read()

OUTPUT = xmltodict.parse(FILE, item_depth=2, item_callback=handle)
print(OUTPUT)
This is the XML:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
<book id="bk104">
<author>Corets, Eva</author>
<title>Oberon's Legacy</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2001-03-10</publish_date>
<description>In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.</description>
</book>
<book id="bk105">
<author>Corets, Eva</author>
<title>The Sundered Grail</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2001-09-10</publish_date>
<description>The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.</description>
</book>
<book id="bk106">
<author>Randall, Cynthia</author>
<title>Lover Birds</title>
<genre>Romance</genre>
<price>4.95</price>
<publish_date>2000-09-02</publish_date>
<description>When Carla meets Paul at an ornithology
conference, tempers fly as feathers get ruffled.</description>
</book>
<book id="bk107">
<author>Thurman, Paula</author>
<title>Splish Splash</title>
<genre>Romance</genre>
<price>4.95</price>
<publish_date>2000-11-02</publish_date>
<description>A deep sea diver finds true love twenty
thousand leagues beneath the sea.</description>
</book>
<book id="bk108">
<author>Knorr, Stefan</author>
<title>Creepy Crawlies</title>
<genre>Horror</genre>
<price>4.95</price>
<publish_date>2000-12-06</publish_date>
<description>An anthology of horror stories about roaches,
centipedes, scorpions and other insects.</description>
</book>
<book id="bk109">
<author>Kress, Peter</author>
<title>Paradox Lost</title>
<genre>Science Fiction</genre>
<price>6.95</price>
<publish_date>2000-11-02</publish_date>
<description>After an inadvertant trip through a Heisenberg
Uncertainty Device, James Salway discovers the problems
of being quantum.</description>
</book>
<book id="bk110">
<author>O'Brien, Tim</author>
<title>Microsoft .NET: The Programming Bible</title>
<genre>Computer</genre>
<price>36.95</price>
<publish_date>2000-12-09</publish_date>
<description>Microsoft's .NET initiative is explored in
detail in this deep programmer's reference.</description>
</book>
<book id="bk111">
<author>O'Brien, Tim</author>
<title>MSXML3: A Comprehensive Guide</title>
<genre>Computer</genre>
<price>36.95</price>
<publish_date>2000-12-01</publish_date>
<description>The Microsoft MSXML3 parser is covered in
detail, with attention to XML DOM interfaces, XSLT processing,
SAX and more.</description>
</book>
<book id="bk112">
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
<genre>Computer</genre>
<price>49.95</price>
<publish_date>2001-04-16</publish_date>
<description>Microsoft Visual Studio 7 is explored in depth,
looking at how Visual Basic, Visual C++, C#, and ASP+ are
integrated into a comprehensive development
environment.</description>
</book>
</catalog>
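This prints every book title, but what I get back is NoneType, so I can't iterate over the output or coerce it into a list.
How can I get the returned output as a list of strings?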
Answer 0 (score: 1)
I ran into a similar problem with xmltodict, so I adapted this snippet, and this is what I ended up with:
#!/usr/bin/env python3
# -*- coding: utf8 -*-
# cElementTree was removed in Python 3.9; ElementTree picks up the C
# accelerator automatically.
import xml.etree.ElementTree as ElementTree


class Parser:
    def __init__(self, file):
        self.root = ElementTree.parse(file).getroot()

    class Dict(dict):
        def __init__(self, parent_element, **kwargs):
            super().__init__(**kwargs)
            if parent_element.items():
                self.update(dict(parent_element.items()))
            for element in parent_element:
                # len(element) is the number of child tags
                if len(element):
                    # treat like dict - we assume that if the first two tags
                    # in a series are different, then they are all different.
                    if len(element) == 1 or element[0].tag != element[1].tag:
                        item = Parser.Dict(element)
                    # treat like list - we assume that if the first two tags
                    # in a series are the same, then the rest are the same.
                    else:
                        # here, we put the list in dictionary; the key is the
                        # tag name the list elements all share in common, and
                        # the value is the list itself
                        item = {element[0].tag: Parser.List(element)}
                    # if the tag has attributes, add those to the dict
                    if element.items():
                        item.update(dict(element.items()))
                    self.update({element.tag: item})
                # this assumes that if you've got an attribute in a tag,
                # you won't be having any text. This may or may not be a
                # good idea -- time will tell. It works for the way we are
                # currently doing XML configuration files...
                elif element.items():
                    self.update({element.tag: dict(element.items())})
                # finally, if there are no child tags and no attributes, extract
                # the text
                else:
                    self.update({element.tag: element.text})

    class List(list):
        def __init__(self, item):
            super().__init__()
            for element in item:
                if len(element):
                    # treat like dict
                    if len(element) == 1 or element[0].tag != element[1].tag:
                        self.append(Parser.Dict(element))
                    # treat like list
                    elif element[0].tag == element[1].tag:
                        self.append(Parser.List(element))
                elif element.text:
                    text = element.text.strip()
                    if text:
                        self.append(text)
                elif element.items():
                    self.append(dict(element.items()))

    @property
    def parsed(self):
        if self.root.items():
            return Parser.Dict(self.root)
        else:
            return {self.root.tag: Parser.List(self.root)}


if __name__ == "__main__":
    import pprint
    pprint.pprint(Parser('PATH_TO_YOUR_XML.xml').parsed)
For your example, the output would be:
{'catalog': [{'author': 'Gambardella, Matthew',
'description': 'An in-depth look at creating applications \n'
' with XML.',
'genre': 'Computer',
'id': 'bk101',
'price': '44.95',
'publish_date': '2000-10-01',
'title': "XML Developer's Guide"},
{'author': 'Ralls, Kim',
'description': 'A former architect battles corporate zombies, \n'
' an evil sorceress, and her own childhood '
'to become queen \n'
' of the world.',
'genre': 'Fantasy',
'id': 'bk102',
'price': '5.95',
'publish_date': '2000-12-16',
'title': 'Midnight Rain'},
{'author': 'Corets, Eva',
'description': 'After the collapse of a nanotechnology \n'
' society in England, the young survivors '
'lay the \n'
' foundation for a new society.',
'genre': 'Fantasy',
'id': 'bk103',
'price': '5.95',
'publish_date': '2000-11-17',
'title': 'Maeve Ascendant'},
{'author': 'Corets, Eva',
'description': 'In post-apocalypse England, the mysterious \n'
' agent known only as Oberon helps to create '
'a new life \n'
' for the inhabitants of London. Sequel to '
'Maeve \n'
' Ascendant.',
'genre': 'Fantasy',
'id': 'bk104',
'price': '5.95',
'publish_date': '2001-03-10',
'title': "Oberon's Legacy"},
{'author': 'Corets, Eva',
'description': 'The two daughters of Maeve, half-sisters, \n'
' battle one another for control of England. '
'Sequel to \n'
" Oberon's Legacy.",
'genre': 'Fantasy',
'id': 'bk105',
'price': '5.95',
'publish_date': '2001-09-10',
'title': 'The Sundered Grail'},
{'author': 'Randall, Cynthia',
'description': 'When Carla meets Paul at an ornithology \n'
' conference, tempers fly as feathers get '
'ruffled.',
'genre': 'Romance',
'id': 'bk106',
'price': '4.95',
'publish_date': '2000-09-02',
'title': 'Lover Birds'},
{'author': 'Thurman, Paula',
'description': 'A deep sea diver finds true love twenty \n'
' thousand leagues beneath the sea.',
'genre': 'Romance',
'id': 'bk107',
'price': '4.95',
'publish_date': '2000-11-02',
'title': 'Splish Splash'},
{'author': 'Knorr, Stefan',
'description': 'An anthology of horror stories about roaches,\n'
' centipedes, scorpions and other insects.',
'genre': 'Horror',
'id': 'bk108',
'price': '4.95',
'publish_date': '2000-12-06',
'title': 'Creepy Crawlies'},
{'author': 'Kress, Peter',
'description': 'After an inadvertant trip through a Heisenberg\n'
' Uncertainty Device, James Salway discovers '
'the problems \n'
' of being quantum.',
'genre': 'Science Fiction',
'id': 'bk109',
'price': '6.95',
'publish_date': '2000-11-02',
'title': 'Paradox Lost'},
{'author': "O'Brien, Tim",
'description': "Microsoft's .NET initiative is explored in \n"
" detail in this deep programmer's "
'reference.',
'genre': 'Computer',
'id': 'bk110',
'price': '36.95',
'publish_date': '2000-12-09',
'title': 'Microsoft .NET: The Programming Bible'},
{'author': "O'Brien, Tim",
'description': 'The Microsoft MSXML3 parser is covered in \n'
' detail, with attention to XML DOM '
'interfaces, XSLT processing, \n'
' SAX and more.',
'genre': 'Computer',
'id': 'bk111',
'price': '36.95',
'publish_date': '2000-12-01',
'title': 'MSXML3: A Comprehensive Guide'},
{'author': 'Galos, Mike',
'description': 'Microsoft Visual Studio 7 is explored in depth,\n'
' looking at how Visual Basic, Visual C++, '
'C#, and ASP+ are \n'
' integrated into a comprehensive '
'development \n'
' environment.',
'genre': 'Computer',
'id': 'bk112',
'price': '49.95',
'publish_date': '2001-04-16',
'title': 'Visual Studio 7: A Comprehensive Guide'}]}
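If only the titles are needed, the full dictionary conversion above may be more than is required. The following is a minimal sketch using only the standard library's ElementTree; the function name `book_titles` and the shortened `SAMPLE_XML` are illustrative assumptions, not part of the answer's code.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for books.xml (assumption: same <catalog>/<book>/<title> layout).
SAMPLE_XML = """<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
  </book>
</catalog>"""


def book_titles(xml_text):
    """Return the text of every <book>/<title> as a list of strings."""
    root = ET.fromstring(xml_text)
    # findall('book') matches direct children of <catalog>;
    # findtext('title') returns None if a book has no <title>.
    return [book.findtext('title') for book in root.findall('book')]


titles = book_titles(SAMPLE_XML)
print(titles)
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.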
Answer 1 (score: 0)
It may be easier to parse the result directly rather than use a callback, e.g.:
import xmltodict

with open(r'C:\Users\u369811\books.xml', 'r') as f:
    FILE = f.read()

# Without item_depth/item_callback, parse() returns the whole document as a dict.
OUTPUT = xmltodict.parse(FILE)
titles = [book['title'] for book in OUTPUT['catalog']['book']]
for title in titles:
    print(title)
Output on my system:
XML Developer's Guide
Midnight Rain
Maeve Ascendant
Oberon's Legacy
The Sundered Grail
Lover Birds
Splish Splash
Creepy Crawlies
Paradox Lost
Microsoft .NET: The Programming Bible
MSXML3: A Comprehensive Guide
Visual Studio 7: A Comprehensive Guide
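If the streaming callback from the question is still preferred (for instance with a very large file), the titles can be collected inside the callback instead of relying on the return value of `parse`, which the question shows to be `None` when `item_depth` is set. This is a sketch under that assumption; `collect_titles` and `SAMPLE_XML` are hypothetical names used only for illustration.

```python
import xmltodict

# Shortened stand-in for books.xml (assumption: same <catalog>/<book>/<title> layout).
SAMPLE_XML = """<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
  </book>
</catalog>"""


def collect_titles(xml_text):
    """Stream <book> items at depth 2 and gather their titles in a list."""
    titles = []

    def handle(_path, book):
        # The callback receives each depth-2 item as a dict of its children.
        titles.append(book['title'])
        return True  # a falsy return value would stop the parse

    # The return value is deliberately ignored; results live in `titles`.
    xmltodict.parse(xml_text, item_depth=2, item_callback=handle)
    return titles


titles = collect_titles(SAMPLE_XML)
print(titles)
```

The closure keeps the list local to one call, so repeated calls do not share state the way a module-level list would.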