Python Beautiful Soup Error: list index out of range

Time: 2015-08-15 20:04:55

Tags: python

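My script fails with the error below. What I am trying to do is this:

If I get a valid URL, I want to use BeautifulSoup to check whether the page contains a form with a button whose value is "Send".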

Traceback (most recent call last):
  File "tester.py", line 27, in <module>
    if viewstate[0]['value'] == "Send":
IndexError: list index out of range

#!/usr/bin/env python

import urllib2
import sys
import os

url = sys.argv[1]
open_dir_list = open("dirlist.txt",'r')
dirs = open_dir_list.read().split("\n")
open_dir_list.close()

for dir in dirs:

        uri = url+"/"+dir

        try:
                from BeautifulSoup import BeautifulStoneSoup
                response = urllib2.urlopen(uri)
                if response.getcode() == 200:
                        s = uri
                        soup = BeautifulStoneSoup(uri)
                        viewstate = soup.findAll("input", {"type": "submit"})
                        if viewstate[0]['value'] == "Send":
                                print("Yay!!!")
                        else:
                                print("There's nothing here")
        except urllib2.HTTPError, e:
                if e.code == 401:
                        print "[!] Authorization Required %s " % (uri)
                elif e.code == 403:
                        print "[!] Forbidden %s " % (uri)
                elif e.code == 404:
                        print "[-] Not Found %s " % (uri)
                elif e.code == 503:
                        print "[!] Service Unavailable %s " % (uri)
                else:
                        print "[?] Unknwon"



print "\n:. FINISH :.\n"

This script works fine, but it only checks the one given path:

import urllib
f = urllib.urlopen("http://xxx.xxx.xxx.xxx/button.jsp")
s = f.read()
f.close()

from BeautifulSoup import BeautifulStoneSoup
soup = BeautifulStoneSoup(s)

viewstate = soup.findAll("input", {"type": "submit"})
if viewstate[0]['value'] == "Send":
        print(" Yay!!!")
else:
        print("No Submit Button")

1 Answer:

Answer 0 (score: 1)

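Apart from what I mentioned in the comments (you are not passing the returned HTML): you pass uri = url+"/"+dir to BeautifulSoup instead of response.read(), so you are searching for tags in the URL string itself, which of course contains no tags. findAll therefore returns an empty list, and viewstate[0] raises the IndexError. You need to pass response.read(), like so:

response = urllib2.urlopen(uri)
if response.getcode() == 200:
        soup = BeautifulStoneSoup(response.read())

If you only want the first match, use .find, and check if viewstate to make sure it matched something before reading its value. You can also iterate over the file object to get one line at a time.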
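The two suggestions above (use .find and guard the result; iterate over the file object) might be sketched roughly as follows. This is only an illustration under some assumptions: it targets Python 3 and the bs4 package (the successor to the BeautifulSoup 3 module used in the question), and the helper names has_send_button and iter_paths, as well as the sample data, are made up for the example. Applying the iteration to the directory list is also an assumption about which file object is meant.

```python
import io

# Assumption: bs4 is installed; it replaces the old "BeautifulSoup" package.
from bs4 import BeautifulSoup


def has_send_button(html):
    """Return True if the markup has a submit input whose value is "Send"."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns the first matching tag, or None when nothing matches,
    # so there is no list to index and no IndexError to run into.
    button = soup.find("input", {"type": "submit"})
    return button is not None and button.get("value") == "Send"


def iter_paths(lines):
    """Yield non-empty paths from a file object (or any iterable of lines)."""
    for line in lines:
        path = line.strip()
        if path:  # skip blank lines, e.g. a trailing newline in the file
            yield path


# Hypothetical sample data standing in for dirlist.txt and a fetched page.
sample_dirlist = io.StringIO("admin\n\nlogin\n")
sample_page = '<form><input type="submit" value="Send"></form>'

paths = list(iter_paths(sample_dirlist))
print(paths)                                   # ['admin', 'login']
print(has_send_button(sample_page))            # True
print(has_send_button("<p>no form here</p>"))  # False
```

In the real script, the string passed to has_send_button would be the body returned by response.read(), not the uri, and iter_paths would be given the open dirlist.txt file object instead of the in-memory sample.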