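I am trying to parse this feed: http://www.codespot.blogspot.in/atom.xml?redirect=false&start-index=1&max-results=500
The problems are:
I am saving the XML to a file so that ElementTree can parse it. How can I avoid that and parse the string response from the GET request directly?
Even with the file-based approach, my attempt to get all the titles still does not work: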
f = open('output.xml','wb+')
f.write(r.content)
f.close()
tree = ""
with open('output.xml', 'rt') as f:
    tree = ElementTree.parse(f)
print tree
root = tree.getroot()
for elem in tree.iter():
    print elem.tag, elem.attrib
for atype in tree.findall('title'):
    print atype.contents
Answer 0 (score: 2)
import urllib2
from xml.etree import cElementTree as ET
conn = urllib2.urlopen("http://www.codespot.blogspot.in/atom.xml?redirect=false&start-index=1&max-results=500")
myins = ET.parse(conn)
for elem in myins.findall('{http://www.w3.org/2005/Atom}entry/{http://www.w3.org/2005/Atom}title'):
    print elem.text
Or, to get both the title and the content of each entry:
for elem in myins.findall('{http://www.w3.org/2005/Atom}entry'):
    print elem.find('{http://www.w3.org/2005/Atom}title').text  ## this will be the title
    print elem.find('{http://www.w3.org/2005/Atom}content').text  ## this will be the content
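On the first part of the question (avoiding the temporary file): if the response body is already in memory, as r.content is in the question, ElementTree can parse it directly with fromstring. The question's findall('title') finds nothing because the Atom elements live in the http://www.w3.org/2005/Atom namespace and the post titles are nested inside entry elements. Below is a minimal sketch of that idea, written in Python 3 syntax. The helper name extract_entries and the SAMPLE_FEED data are hypothetical, made up for illustration; SAMPLE_FEED merely stands in for the real response body.

```python
from xml.etree import ElementTree as ET

# Atom namespace, the same one spelled out in the fully qualified tags above.
ATOM_NS = {'atom': 'http://www.w3.org/2005/Atom'}


def extract_entries(xml_bytes):
    """Return a list of (title, content) tuples from an Atom feed held in memory.

    Hypothetical helper for illustration; not part of any library.
    """
    # fromstring parses bytes (or str) directly, so no file is needed.
    root = ET.fromstring(xml_bytes)
    entries = []
    for entry in root.findall('atom:entry', ATOM_NS):
        # findtext returns None when the child element is missing.
        title = entry.findtext('atom:title', default=None, namespaces=ATOM_NS)
        content = entry.findtext('atom:content', default=None, namespaces=ATOM_NS)
        entries.append((title, content))
    return entries


# Made-up sample feed standing in for r.content from the question.
SAMPLE_FEED = b"""<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry><title>First post</title><content type="html">Hello</content></entry>
  <entry><title>Second post</title><content type="html">World</content></entry>
</feed>"""

for title, content in extract_entries(SAMPLE_FEED):
    print(title, content)
```

With a real response, calling extract_entries(r.content) should work the same way, assuming the server returns a well-formed Atom document. The feed-level title is skipped on purpose, since only entry children are searched.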