How to read a single XML item

Time: 2015-05-07 17:12:08

Tags: xml python-2.7 xml-parsing

I am trying to use this example from this tutorial (here is a link):

#!/usr/bin/python

import xml.sax

class MovieHandler( xml.sax.ContentHandler ):
   code........
if ( __name__ == "__main__"):

   # create an XMLReader
   parser = xml.sax.make_parser()
   # turn off namespaces
   parser.setFeature(xml.sax.handler.feature_namespaces, 0)

   # override the default ContentHandler
   Handler = MovieHandler()
   parser.setContentHandler( Handler )

   parser.parse("movies.xml")

It produces this output:

*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Year: 2003
Rating: PG
Stars: 10
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Year: 1989
Rating: R
Stars: 8
Description: A schientific fiction
*****Movie*****
Title: Trigun
Type: Anime, Action
Format: DVD
Rating: PG
Stars: 10
Description: Vash the Stampede!
*****Movie*****
Title: Ishtar
Type: Comedy
Format: VHS
Rating: PG
Stars: 2
Description: Viewable boredom

Suppose I only want this result:

*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Year: 2003
Rating: PG
Stars: 10

Or this:

*****Movie*****
    Title: Enemy Behind
    Type: War, Thriller
    Rating: PG
    Stars: 10

What can I do? I only recently started learning Python and XML.

1 Answer:

Answer 0: (score: 1)

This kind of thing can be done by parsing the XML into a DOM tree, which you can then query easily with random access.
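
To make "random access" concrete, here is a minimal sketch. It is not from the tutorial: the inline SAMPLE_XML is an assumption about how movies.xml is laid out, inferred from the output in the question, and it is parsed with parseString so the snippet runs without the file. The print calls are written so that they work on both Python 2.7 and Python 3.

```python
import xml.dom.minidom

# Assumed sample data: the real movies.xml is not shown in the question,
# so this structure is inferred from the printed output.
SAMPLE_XML = """<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <year>2003</year>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <year>1989</year>
  </movie>
</collection>"""

dom = xml.dom.minidom.parseString(SAMPLE_XML)
movies = dom.documentElement.getElementsByTagName("movie")

# Random access: once the tree is built, any movie can be reached by
# index, in any order, without re-reading the document.
second = movies[1]
first = movies[0]
print(second.getAttribute("title"))
print(first.getElementsByTagName("year")[0].firstChild.data)
```

With a SAX handler, by contrast, elements arrive as a stream of events, so the handler itself would have to keep track of which movie it is currently inside.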

For example, to print the movie titled "Enemy Behind", you could do the following:

#!/usr/bin/python

import xml.dom.minidom

# Open XML document using minidom parser
DOMTree = xml.dom.minidom.parse("movies.xml")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
    print "Root element : %s" % collection.getAttribute("shelf")

# Get all the movies in the collection
movies = collection.getElementsByTagName("movie")

# Print the details of the matching movie only.
for movie in movies:
    title = movie.getAttribute("title")
    if title == "Enemy Behind":
        print "*****Movie*****"
        print "Title: %s" % title

        # The *_node names avoid shadowing the built-ins type() and format().
        type_node = movie.getElementsByTagName('type')[0]
        print "Type: %s" % type_node.childNodes[0].data
        format_node = movie.getElementsByTagName('format')[0]
        print "Format: %s" % format_node.childNodes[0].data
        rating_node = movie.getElementsByTagName('rating')[0]
        print "Rating: %s" % rating_node.childNodes[0].data
        description_node = movie.getElementsByTagName('description')[0]
        print "Description: %s" % description_node.childNodes[0].data
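
To get the shorter output from the question (only some of the fields), the same idea can be wrapped in a small helper that takes the title and the list of fields to show. This is only a sketch: format_movie is a hypothetical name, and SAMPLE_XML is an assumed stand-in for movies.xml with element names inferred from the output in the question. It runs on both Python 2.7 and Python 3.

```python
import xml.dom.minidom

# Assumed sample data: element names are inferred from the question's
# output; swap in xml.dom.minidom.parse("movies.xml") for the real file.
SAMPLE_XML = """<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Trigun">
    <type>Anime, Action</type>
    <format>DVD</format>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Vash the Stampede!</description>
  </movie>
</collection>"""


def format_movie(dom, title, fields):
    """Return output lines for the movie with this title, limited to `fields`.

    Hypothetical helper, not part of minidom. Returns an empty list
    when no movie has the requested title.
    """
    for movie in dom.getElementsByTagName("movie"):
        if movie.getAttribute("title") != title:
            continue
        lines = ["*****Movie*****", "Title: %s" % title]
        for field in fields:
            nodes = movie.getElementsByTagName(field)
            # Skip fields that are missing or empty, e.g. a movie with no <year>.
            if nodes and nodes[0].firstChild is not None:
                lines.append("%s: %s" % (field.capitalize(), nodes[0].firstChild.data))
        return lines
    return []


dom = xml.dom.minidom.parseString(SAMPLE_XML)
for line in format_movie(dom, "Enemy Behind", ["type", "rating", "stars"]):
    print(line)
```

Passing ["type", "format", "year", "rating", "stars"] instead should give the first of the two desired outputs, and a field that a movie lacks is simply skipped rather than raising an IndexError.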