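I am trying to extract the <comment> tags from an XML file (using xml.etree.ElementTree), find the count in each comment, and add up all the numbers. I am using the urllib package to read the file from a URL.
Sample data: http://python-data.dr-chuck.net/comments_42.xml
For now, though, I am only trying to print each name and its count.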
import urllib
import xml.etree.ElementTree as ET
serviceurl = 'http://python-data.dr-chuck.net/comments_42.xml'
address = raw_input("Enter location: ")
url = serviceurl + urllib.urlencode({'sensor': 'false', 'address': address})
print ("Retrieving: ", url)
link = urllib.urlopen(url)
data = link.read()
print("Retrieved ", len(data), "characters")
tree = ET.fromstring(data)
tags = tree.findall('.//comment')
for tag in tags:
    Name = ''
    count = ''
    Name = tree.find('commentinfo').find('comments').find('comment').find('name').text
    count = tree.find('comments').find('comments').find('comment').find('count').number
    print Name, count
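For reference, here is a minimal sketch of the result I am after, run against an inline string instead of the URL. It assumes the document has the shape commentinfo > comments > comment, with name and count children, which is inferred from the find() calls above. The sample names and numbers are made up, sum_counts is just a hypothetical helper name, and the sketch uses Python 3 print syntax.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; the element layout is an assumption, not the real file.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Alice</name><count>3</count></comment>
    <comment><name>Bob</name><count>4</count></comment>
  </comments>
</commentinfo>"""


def sum_counts(xml_text):
    """Return ([(name, count), ...], total) for every <comment> element."""
    root = ET.fromstring(xml_text)
    pairs = []
    # Look up name and count relative to each comment, not the root.
    for comment in root.findall('.//comment'):
        name = comment.find('name').text
        count = int(comment.find('count').text)  # .text is a string
        pairs.append((name, count))
    return pairs, sum(count for _, count in pairs)


if __name__ == '__main__':
    pairs, total = sum_counts(SAMPLE)
    for name, count in pairs:
        print(name, count)
    print('Total:', total)
```

The lookups are done on each comment element inside the loop, so every iteration reads a different name and count.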
Unfortunately, I cannot even parse the XML file in Python, because I get the following error:
Traceback (most recent call last):
  File "ch13_parseXML_assignment.py", line 14, in <module>
    tree = ET.fromstring(data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: syntax error: line 1, column 49
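One thing that may be worth checking before the parse is what actually gets requested. The sketch below only builds the URL the same way the script does and prints it; it does not fetch anything. It uses Python 3's urllib.parse.urlencode (the Python 2 equivalent is urllib.urlencode), and 'Ann Arbor' is a made-up input. Because the query string is appended to a URL ending in .xml with no ? separator, it is possible, though not confirmed here, that the server returns something other than the expected XML, which would be consistent with a syntax error on line 1.

```python
from urllib.parse import urlencode

serviceurl = 'http://python-data.dr-chuck.net/comments_42.xml'
address = 'Ann Arbor'  # made-up example input

# Same concatenation as in the script: nothing separates the path from the query.
url = serviceurl + urlencode({'sensor': 'false', 'address': address})
print(url)
```

Printing the first part of data right after link.read() would show directly whether the response is the XML document or something else.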
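I have read that in similar situations the parser may simply not accept the XML file. Anticipating that, I wrapped tree = ET.fromstring(data) in a try/except. That got me past this line, but a later line then raised an error saying the variable tree is not defined, which defeats the purpose of getting the output I want.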
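A minimal sketch of that failure mode, assuming a plain try/except around the parse (the malformed input string is made up): when ET.fromstring raises, the assignment to tree never completes, so the name stays unbound and the next use of it raises NameError.

```python
import xml.etree.ElementTree as ET

data = 'this is not XML'  # made-up stand-in for the retrieved response

try:
    tree = ET.fromstring(data)
except ET.ParseError as err:
    # The assignment above never completed, so 'tree' is still unbound.
    print('Parse failed:', err)

try:
    tree.findall('.//comment')
    outcome = 'tree is defined'
except NameError as err:
    outcome = 'NameError'
    print('Later use fails:', err)
```

So the try/except only hides the parse error; it does not produce a usable tree.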
Can someone point me in the right direction?