I am running a for loop to extract content from some XML, and it works fine until I reach the 29th iteration. At that point it gives me this error:
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "J:\Art & Graphic Design\Graphic Design\Websites\lawvoter-dev\cron_congressman.py", line 64, in get
    birthday = re.findall("<birthday>(.*)</birthday>",element)[0]
IndexError: list index out of range
The code is:
for element in members:
    title = re.findall("<title>(.*)</title>",element)[0]
    role = re.findall("<role_type_label>(.*)</role_type_label>",element)[0]
    name_sortable = re.findall("<name_sortable>(.*)</name_sortable>",element)[0]
    firstname = re.findall("<firstname>(.*)</firstname>",element)[0]
    lastname = re.findall("<lastname>(.*)</lastname>",element)[0]
    gender = re.findall("<gender_label>(.*)</gender_label>",element)[0]
    birthday = re.findall("<birthday>(.*)</birthday>",element)[0]
    party = re.findall("<party>(.*)</party>",element)[0]
    state = re.findall("<state>(.*)</state>",element)[0]
    description = re.findall("<description>(.*)</description>",element)[0]
    start_date = re.findall("<startdate>(.*)</startdate>",element)[0]
    end_date = re.findall("<enddate>(.*)</enddate>",element)[0]
    website = re.findall("<website>(.*)</website>",element)[0]
    bioguideid = re.findall("<bioguideid>(.*)</bioguideid>",element)[0]
    osid = re.findall("<osid>(.*)</osid>",element)[0]
    pvsid = re.findall("<pvsid>(.*)</pvsid>",element)[0]
    twitterid = re.findall("<twitterid>(.*)</twitterid>",element)[0]
    youtubeid = re.findall("<youtubeid>(.*)</youtubeid>",element)[0]
    member = Congressman(title=title, role=role, name_sortable=name_sortable, firstname=firstname, lastname=lastname, gender=gender, birthday=birthday, party=party, state=state,
                         description=description, start_date=start_date, end_date=end_date, website=website, bioguideid=bioguideid, osid=osid, pvsid=pvsid, twitterid=twitterid, youtubeid=youtubeid)
    member.put()
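I really don't understand why I am getting this error, given that it always works for the first 29 iterations. Just in case, every property in the data model is also set to "default = None". However, when I look at the XML itself and go to the exact line where the error occurs, the value is actually there. Does anyone know why it fails even though the value exists?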
Answer 0 (score: 1)
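It looks like
birthday = re.findall("<birthday>(.*)</birthday>",element)[0]
returns an empty list, so you are trying to take the first element of a list that has none, which raises
IndexError: list index out of range
as in this example: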
>>> l = []
>>> l[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>>
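To make the empty-list case concrete, here is a minimal sketch of when re.findall returns no matches. The sample strings are hypothetical and not taken from the real feed. The multi-line case is only one possible explanation for a value that is visibly present in the XML yet not matched: by default "." does not match a newline, so a tag whose content spans several lines yields an empty list unless re.DOTALL is passed. Whether that is what happens on the failing record here is an assumption.

```python
import re

# Hypothetical sample fragments, not taken from the real feed.
present = "<birthday>1950-01-01</birthday>"
missing = "<firstname>Jane</firstname>"
multiline = "<birthday>\n1950-01-01\n</birthday>"

pattern = "<birthday>(.*)</birthday>"

# The tag is on one line: one captured value.
matches_present = re.findall(pattern, present)

# The tag is absent: an empty list, so [0] would raise IndexError.
matches_missing = re.findall(pattern, missing)

# The tag is there, but "." does not cross newlines: also an empty list.
matches_multiline = re.findall(pattern, multiline)

# With re.DOTALL, "." matches newlines too, so the value is captured.
matches_dotall = re.findall(pattern, multiline, re.DOTALL)
```

Printing repr(element) for the failing iteration should show whether the tag is really absent from that string or merely broken across lines.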
Edit:
import logging
import re

def findelement(item, element):
    # Return the first captured match, or '' if the pattern matches nothing.
    i = re.findall(item, element)
    if not i:
        logging.info('no item found for %s with element %s' % (item, element))
        return ''
    return i[0]

for element in members:
    title = findelement("<title>(.*)</title>", element)
    ...
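The helper above can also be applied to every field without repeating the call by hand. The following is a rough sketch, not the original poster's code: the names TAGS and extract_fields are hypothetical, the tag list is only a subset of the fields in the question, and the sample string is made up for illustration.

```python
import logging
import re


def findelement(item, element):
    # Same helper as in the answer: first match, or '' when nothing matches.
    i = re.findall(item, element)
    if not i:
        logging.info('no item found for %s with element %s' % (item, element))
        return ''
    return i[0]


# Hypothetical subset of the tag names used in the question.
TAGS = ["title", "firstname", "lastname", "birthday"]


def extract_fields(element, tags=TAGS):
    # Build one pattern per tag and collect the results in a dict.
    return dict(
        (tag, findelement("<%s>(.*)</%s>" % (tag, tag), element))
        for tag in tags
    )


# Made-up record that has no <birthday> tag.
sample = "<title>Sen.</title><firstname>Jane</firstname><lastname>Doe</lastname>"
fields = extract_fields(sample)
```

With this shape a missing tag ends up as an empty string in the dict and is logged, instead of aborting the whole loop with an IndexError.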