Parsing an XML response: "list index out of range" error

Date: 2013-08-12 13:51:02

Tags: python xml

I got this script from a forum, and it keeps failing with the following error:

Traceback (most recent call last):
  File "test.py", line 42, in <module>
    main()
  File "test.py", line 28, in main
    bot_response = objektid[0].toxml()
IndexError: list index out of range

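I have been searching for an answer, but I can't relate the answers I find to my own code, probably because I'm such a Python newbie.

The script is below.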

#!/usr/bin/python -tt

# Have a conversation with a PandaBot AI
# Author A.Roots

import urllib, urllib2
import sys
from xml.dom import minidom
from xml.sax.saxutils import unescape

def main():

  human_input = raw_input('You: ')
  if human_input == 'exit':
    sys.exit(0)

  base_url = 'http://www.pandorabots.com/pandora/talk-xml'
  data = urllib.urlencode([('botid', 'ebbf27804e3458c5'), ('input', human_input)])

  # Submit POST data and download response XML
  req = urllib2.Request(base_url)
  fd = urllib2.urlopen(req, data)

  # Take Bot's response out of XML
  xmlFile = fd.read()
  dom = minidom.parseString(xmlFile)
  objektid = dom.getElementsByTagName('that')
  bot_response = objektid[0].toxml()
  bot_response = bot_response[6:]
  bot_response = bot_response[:-7]
  # Some nasty unescaping
  bot_response = unescape(bot_response, {"&amp;apos;": "'", "&amp;quot;": '"'})

  print 'Getter:',str(bot_response)

  # Repeat until terminated
  while 1:
    main()

if __name__ == '__main__':
  print 'Hi. You can now talk to Getter. Type "exit" when done.'
  main()

Many thanks for any help with this.

2 Answers:

Answer 0: (score: 5)

No <that> element is found by this call:

objektid = dom.getElementsByTagName('that')

so the list it returns is empty.

Testing the code, I get this message:

<result status="3" botid="ebbf27804e3458c5"><input>Hello world!</input><message>Failed to find bot</message></result>

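which contains no such tag. The error message suggests that the bot ID you are using does not exist or is no longer available. Perhaps you need to register a new bot of your own on the Pandorabots homepage?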
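To see this concretely, here is a minimal sketch that parses the error response quoted above and checks for <that> before indexing. It uses Python 3 print syntax (an assumption; the question's script is Python 2, where the minidom calls are the same), and the variable names are only illustrative.

from xml.dom import minidom

# The error response quoted above, verbatim.
xml_text = ('<result status="3" botid="ebbf27804e3458c5">'
            '<input>Hello world!</input>'
            '<message>Failed to find bot</message></result>')

dom = minidom.parseString(xml_text)

# No <that> element exists, so the returned list is empty
# and indexing it with [0] would raise IndexError.
that_nodes = dom.getElementsByTagName('that')
has_that = len(that_nodes) > 0

# The response carries a status attribute and a <message> instead.
status = dom.documentElement.getAttribute('status')
message_nodes = dom.getElementsByTagName('message')
message = message_nodes[0].firstChild.data if message_nodes else ''

print(has_that, status, message)

Printing the status and message when <that> is missing makes the real cause (the bot lookup failing) visible instead of an IndexError.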

I notice you are doing some nasty unescaping. Why not grab the text node under that tag and let the DOM library handle it for you?
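A rough sketch of that idea follows. The helper name text_of_first and the sample success response are hypothetical (the real service's reply format may differ); the point is only that the parser has already decoded entities such as &apos; and &quot; by the time you read the text node, so no slicing or manual unescaping is needed.

from xml.dom import minidom, Node

def text_of_first(dom, tag, default=''):
    """Return the text inside the first <tag> element, or default if absent."""
    nodes = dom.getElementsByTagName(tag)
    if not nodes:
        return default
    # Join all text children; entities are already decoded by the parser.
    return ''.join(child.data for child in nodes[0].childNodes
                   if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE))

# Hypothetical success response, made up for illustration.
sample = ('<result status="0" botid="ebbf27804e3458c5">'
          '<input>Hello</input>'
          '<that>It&apos;s &quot;nice&quot; to meet you &amp; yours.</that></result>')

reply = text_of_first(minidom.parseString(sample), 'that')
print(reply)

If the server were to escape its text twice (for example sending &amp;apos;), the text node would still contain &apos; and one unescape pass would remain necessary; that case is an assumption and is not verified here.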

You may also want to look at the ElementTree API (included with Python), as it is easier to use.

Answer 1: (score: 1)
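For comparison, a minimal ElementTree sketch of the same lookup might look like this. The function name bot_reply and the success response are hypothetical; the error response is the one quoted above.

import xml.etree.ElementTree as ET

def bot_reply(xml_text):
    """Return the bot's reply text, or an error string built from <message>."""
    root = ET.fromstring(xml_text)
    # findtext returns None when there is no <that> child, so nothing is indexed.
    reply = root.findtext('that')
    if reply is not None:
        return reply
    return 'Error: ' + root.findtext('message', default='unknown')

error_xml = ('<result status="3" botid="ebbf27804e3458c5">'
             '<input>Hello world!</input>'
             '<message>Failed to find bot</message></result>')
# Hypothetical success response, made up for illustration.
ok_xml = '<result status="0"><input>Hi</input><that>Hello &amp; welcome</that></result>'

print(bot_reply(error_xml))
print(bot_reply(ok_xml))

Because findtext returns None (or a default you supply) when the element is missing, there is no list to index and the IndexError cannot occur.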

The problem is here:

   objektid = dom.getElementsByTagName('that')
   bot_response = objektid[0].toxml()

If dom.getElementsByTagName returns nothing at all, then objektid[0], the first element of objektid, does not exist. Hence the error!

To get around it, do something like this:

  objektid = dom.getElementsByTagName('that')
  if len(objektid) > 0:  # only index the list when it is non-empty
      bot_response = objektid[0].toxml()
      bot_response = bot_response[6:]
      bot_response = bot_response[:-7]
      # Some nasty unescaping
      bot_response = unescape(bot_response, {"&amp;apos;": "'", "&amp;quot;": '"'})
  else:
      bot_response = ""

  print 'Getter:',str(bot_response)