Question

When I try to run my program, I keep getting IndexError: list index out of range. Here is my code:

''' This program accesses the Bloomberg US Stock information page.
    It uses BeautifulSoup to parse the html and then finds the elements with the top 20 stocks.
    It finds the stock name, value, net change, and percent change.
'''

import urllib
from urllib import request
from bs4 import BeautifulSoup


# get the bloomberg stock page
bloomberg_url = "http://www.bloomberg.com/markets/stocks/world-indexes/americas"

try:
    response = request.urlopen(bloomberg_url)
except urllib.error.URLError as e:
    if hasattr(e, 'reason'):
        print('We failed to reach a server.')
        print('Reason: ', e.reason)
    elif hasattr(e, 'code'):
        print('The server couldn\'t fulfill the request.')
        print('Error code: ', e.code)
else:
    # the url request was successful
    html = response.read().decode('utf8')

    # use the BeautifulSoup parser to create the beautiful soup object with html structure
    bloomsoup = BeautifulSoup(html, 'html.parser')
    pagetitle = bloomsoup.title.get_text()

    # the 20 stocks are stored in the first 10 "oddrow" tr tags
    #   and the 10 "evenrow" tr tags
    oddrows = bloomsoup.find_all("tr",class_="oddrow")
    evenrows = bloomsoup.find_all("tr",class_="evenrow")

    # alternate odd and even rows to put all 20 rows together
    allrows=[]
    for i in range(12):
        allrows.append(oddrows[i])
        allrows.append(evenrows[i])
    allrows.append(oddrows[12])

    # iterate over the BeautifulSoup tr tag objects and get the team items into a dictionary
    stocklist = [ ]
    for item in allrows:
        stockdict = { }
        stockdict['stockname'] = item.find_all('a')[1].get_text()
        stockdict['value'] = item.find("td",class_="pr-rank").get_text()
        stockdict['net'] = item.find('span',class_="pr-net").get_text()
        stockdict['%'] = item.find('td',align="center").get_text()
        stocklist.append(stockdict)

    # print the title of the page
    print(pagetitle, '\n')

    # print out all the teams
    for stock in stocklist:
        print('Name:', stock['stockname'], 'Value:', stock['value'], 'Net Change:', stock['net'],\
            'Percent Change:', stock['%'])

Answer 1

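Based on your comment, oddrows and evenrows each have only 10 elements:

the 20 stocks are stored in the first 10 "oddrow" tr tags and the 10 "evenrow" tr tags

But you loop 12 times instead of 10: for i in range(12):

Change 12 to 10 and it should work. The trailing allrows.append(oddrows[12]) indexes past the end of a 10-element list for the same reason, so it has to go as well.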
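To see why the index runs off the end, here is a minimal sketch that uses a plain list of 10 strings as a stand-in for the find_all result. The names rows and collect are made up for illustration and are not part of the question's code.

```python
# Stand-in for the result of bloomsoup.find_all(...): assumed to hold 10 entries.
rows = ["row{}".format(n) for n in range(10)]


def collect(rows, count):
    """Copy the first `count` rows by index, the way the question's loop does."""
    out = []
    for i in range(count):
        out.append(rows[i])  # raises IndexError once i reaches len(rows)
    return out


try:
    collect(rows, 12)
except IndexError as exc:
    print("IndexError:", exc)

# Looping over the actual length never overruns the list.
print(len(collect(rows, len(rows))))
```

Using len(rows) instead of a literal count keeps the loop valid even if the page later returns a different number of rows.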

Side note: I would not recommend hard-coding that value.

You could replace

allrows=[]
for i in range(12):
    allrows.append(oddrows[i])
    allrows.append(evenrows[i])

with

allrows=[]
for x,y in zip(oddrows,evenrows):
    allrows.append(x)
    allrows.append(y)
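As a quick check of what the zip version produces, here is a small sketch in which plain strings stand in for the BeautifulSoup tr tags (the sample data is assumed, not taken from the page):

```python
# Plain strings stand in for the <tr> tags returned by find_all (assumed data).
oddrows = ["odd0", "odd1", "odd2"]
evenrows = ["even0", "even1", "even2"]

allrows = []
for x, y in zip(oddrows, evenrows):
    allrows.append(x)  # odd row first
    allrows.append(y)  # then the matching even row

print(allrows)
```

Note that zip stops at the shorter of the two lists, so if one list has an extra trailing row, that row is silently left out rather than raising an error.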

Answer 2

Replace your allrows loop with the following. The original loop is a bit wonky and very error-prone.

import itertools
allrows = [i for z in itertools.zip_longest(oddrows, evenrows) for i in z if i]

If you don't want index errors or similar problems, get rid of the manual indexing. Be more functional.
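A hedged sketch of how zip_longest behaves when the two lists differ in length, again with plain strings standing in for the tr tags (assumed sample data). It filters with an explicit "is not None" test instead of a bare truthiness check, which is a small variation on the one-liner above: it drops only the padding that zip_longest adds and does not depend on how the row objects evaluate as booleans.

```python
import itertools

# Assumed sample data: one more odd row than even rows.
oddrows = ["odd0", "odd1", "odd2"]
evenrows = ["even0", "even1"]

allrows = [
    row
    for pair in itertools.zip_longest(oddrows, evenrows)
    for row in pair
    if row is not None  # skip the None padding added for the shorter list
]

print(allrows)
```

Unlike plain zip, this keeps the unmatched last odd row, so no separate append for a leftover element is needed.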

Python - IndexError in a program that fetches data from a URL
