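I have been given an assignment in which I must extract the numbers from this XML file and then sum them. The problem is that when I run the for loop to get the data, I get an attribute error:
TypeError: 'NoneType' object is not callable
Here is my code so far: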
import urllib
import xml.etree.ElementTree as ET
url = raw_input('Enter location: ')
print 'Retrieving', url
uh = urllib.urlopen(url)
data = uh.read()
print 'Retrieved',len(data),'characters'
tree = ET.fromstring(data)
lst = tree.findall('.//count')
print 'Count:', len(lst)
for item in lst:
    print 'name', item.find('count').text
I am supposed to extract the text from the count tag:
<comment>
<name>Matthias</name>
<count>97</count>
</comment>
What am I missing here?
Answer 0 (score: 1)
I suggest using Beautiful Soup: http://www.crummy.com/software/BeautifulSoup/bs4/doc/
It makes parsing XML files very easy:
from bs4 import BeautifulSoup
soup = BeautifulSoup(data) # It seems that your data variable holds the xml
for tag in soup.find_all('count'):
    print tag.get_text()
Answer 1 (score: 1)
I tried my code and it worked!
import urllib
import xml.etree.ElementTree as ET
url = raw_input('Enter location: ')
print 'Retrieving', url
uh = urllib.urlopen(url)
data = uh.read()
print 'Retrieved',len(data),'characters'
tree = ET.fromstring(data)
lst = tree.findall('.//count')
print 'Count:', len(lst)
total = 0
for comment in tree.findall("./comments/comment"):
    total += int(comment.find('count').text)
print total
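For what it's worth, the error in the question seems to come from calling find('count') on elements that are already the count elements. Here is a minimal sketch in Python 3 that shows the difference. The sample data is made up: the root element name commentinfo and the second comment are assumptions, not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, shaped like the snippet in the question.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Matthias</name><count>97</count></comment>
    <comment><name>Anna</name><count>3</count></comment>
  </comments>
</commentinfo>"""

tree = ET.fromstring(SAMPLE)
counts = tree.findall('.//count')

# Each item is already a <count> element with no <count> child,
# so find('count') returns None; reading .text from None would fail.
nested = [item.find('count') for item in counts]

# Reading .text from the element itself gives the number.
values = [int(item.text) for item in counts]
total = sum(values)
print(nested, values, total)
```

So either loop over the comment elements and call find('count') on each, as above, or loop over the count elements and read item.text directly.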
Answer 2 (score: 0)
As Gabor Erdos said, BS4 is much nicer, but if you want to use ElementTree:
import urllib
import xml.etree.ElementTree as ET
url = 'http://python-data.dr-chuck.net/comments_208135.xml'
tresc = urllib.urlopen(url).read()
tree = ET.fromstring(tresc)
wartosci = tree.findall('.//count')
sum = 0
count = 0
for item in wartosci:
    x = int(item.text)
    sum = sum + x
    count = count + 1
# Python 2 print statements, to match urllib.urlopen above
print 'Retrieving', url
print 'Retrieved', len(tresc), 'characters'
print 'Count:', count
print 'Sum:', sum
Answer 3 (score: 0)
For Python 3, this hassle-free code should work for your task:
Answer 4 (score: 0)
import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET
url = 'http://py4e-data.dr-chuck.net/comments_4772.xml'
print('Retrieving', url)
uh = urllib.request.urlopen(url)
data = uh.read()
print('Retrieved', len(data), 'characters')
tree = ET.fromstring(data)
# Every <count> element, at any depth
counts = tree.findall('.//count')
print('Count', len(counts))
total = 0
for item in counts:
    element = item.text
    total += int(element)
print(total)
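If you want to check the parsing without hitting the network, one option is to put the summing into a small function that takes the XML text. This is only a sketch in Python 3: the function name sum_counts and the sample string are made up for illustration, and the root element name commentinfo is an assumption.

```python
import xml.etree.ElementTree as ET

def sum_counts(xml_text):
    """Return (number of <count> elements, sum of their integer values)."""
    root = ET.fromstring(xml_text)
    # iter('count') walks the whole tree and yields every <count> element.
    values = [int(el.text) for el in root.iter('count')]
    return len(values), sum(values)

# Hypothetical sample data, not the real assignment file.
sample = (
    "<commentinfo><comments>"
    "<comment><name>Matthias</name><count>97</count></comment>"
    "<comment><name>Anna</name><count>3</count></comment>"
    "</comments></commentinfo>"
)

n, total = sum_counts(sample)
print('Count:', n)
print('Sum:', total)
```

With the real file you would pass it the data read from urllib.request.urlopen(url), as in the code above.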