I keep getting IndexError: list index out of range when I try to run my program. Here is my code:
''' This program accesses the Bloomberg US Stock information page.
It uses BeautifulSoup to parse the html and then finds the elements with the top 20 stocks.
It finds the stock name, value, net change, and percent change.
'''
import urllib
from urllib import request
from bs4 import BeautifulSoup
# get the bloomberg stock page
bloomberg_url = "http://www.bloomberg.com/markets/stocks/world-indexes/americas"
try:
    response = request.urlopen(bloomberg_url)
except urllib.error.URLError as e:
    if hasattr(e, 'reason'):
        print('We failed to reach a server.')
        print('Reason: ', e.reason)
    elif hasattr(e, 'code'):
        print('The server couldn\'t fulfill the request.')
        print('Error code: ', e.code)
else:
    # the url request was successful
    html = response.read().decode('utf8')
    # use the BeautifulSoup parser to create the beautiful soup object with html structure
    bloomsoup = BeautifulSoup(html, 'html.parser')
    pagetitle = bloomsoup.title.get_text()
    # the 20 stocks are stored in the first 10 "oddrow" tr tags
    # and the 10 "evenrow" tr tags
    oddrows = bloomsoup.find_all("tr",class_="oddrow")
    evenrows = bloomsoup.find_all("tr",class_="evenrow")
    # alternate odd and even rows to put all 20 rows together
    allrows=[]
    for i in range(12):
        allrows.append(oddrows[i])
        allrows.append(evenrows[i])
    allrows.append(oddrows[12])
    # iterate over the BeautifulSoup tr tag objects and get the stock items into a dictionary
    stocklist = [ ]
    for item in allrows:
        stockdict = { }
        stockdict['stockname'] = item.find_all('a')[1].get_text()
        stockdict['value'] = item.find("td",class_="pr-rank").get_text()
        stockdict['net'] = item.find('span',class_="pr-net").get_text()
        stockdict['%'] = item.find('td',align="center").get_text()
        stocklist.append(stockdict)
    # print the title of the page
    print(pagetitle, '\n')
    # print out all the stocks
    for stock in stocklist:
        print('Name:', stock['stockname'], 'Value:', stock['value'], 'Net Change:', stock['net'],\
              'Percent Change:', stock['%'])
Answer 0 (score: 4)
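oddrows and evenrows each contain only 10 elements. Your own comment says as much:
the 20 stocks are stored in the first 10 "oddrow" tr tags and the 10 "evenrow" tr tags
But you loop 12 times instead of 10: for i in range(12):
Change 12 to 10 and it should work. The allrows.append(oddrows[12]) line after the loop has to go as well, since index 12 is also out of range for a 10-element list.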
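The failure can be reproduced without touching the network. The following is only a minimal sketch: the two lists of placeholder strings stand in for the real tr tags, and the helper name interleave_by_index is hypothetical, not part of the question's code.

```python
# Stand-ins for oddrows/evenrows: 10 placeholder strings each
# instead of real <tr> tags (an assumption for illustration).
oddrows = [f"odd{i}" for i in range(10)]
evenrows = [f"even{i}" for i in range(10)]


def interleave_by_index(odd, even, count):
    """Alternate odd and even rows using a hard-coded count, as the question does."""
    rows = []
    for i in range(count):
        rows.append(odd[i])
        rows.append(even[i])
    return rows


# Looping 10 times stays within bounds and collects all 20 rows.
allrows = interleave_by_index(oddrows, evenrows, 10)
print(len(allrows))

# Looping 12 times fails as soon as i reaches 10.
error_message = None
try:
    interleave_by_index(oddrows, evenrows, 12)
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)
```

The hard-coded count is the only thing that differs between the working call and the failing one, which is why deriving the count from the lists themselves is the safer choice.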
Side note: I would not recommend hard-coding that value.
You can replace
allrows=[]
for i in range(12):
    allrows.append(oddrows[i])
    allrows.append(evenrows[i])
with
allrows=[]
for x,y in zip(oddrows,evenrows):
    allrows.append(x)
    allrows.append(y)
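One property of zip is worth keeping in mind here: it stops at the shorter input. A small sketch with placeholder strings in place of the tr tags (the function name interleave and the sample values are made up for illustration):

```python
def interleave(odd, even):
    """Alternate items from two lists; zip stops at the shorter one."""
    rows = []
    for x, y in zip(odd, even):
        rows.append(x)
        rows.append(y)
    return rows


# Equal lengths: every item is kept, in alternating order.
equal = interleave(["o0", "o1", "o2"], ["e0", "e1", "e2"])
print(equal)

# Unequal lengths: the unpaired "o2" is dropped without any error.
uneven = interleave(["o0", "o1", "o2"], ["e0", "e1"])
print(uneven)
```

So the zip version can never raise IndexError, but if the page ever returns one more odd row than even rows, that last row is left out silently.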
Answer 1 (score: 1)
Replace the allrows loop with the following. The loop as written is a bit clunky and very error-prone.
import itertools
allrows = [i for z in itertools.zip_longest(oddrows, evenrows) for i in z if i]
This drops the indexing altogether, and with it the index errors. It is also more functional in style.
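To see what zip_longest does with inputs of different lengths, here is a small sketch using placeholder strings rather than real tr tags (the sample values are assumptions). It filters with an explicit "is not None" check instead of truthiness, so that falsy items such as empty strings would not be discarded along with the padding:

```python
import itertools

# Placeholder data: one more odd row than even rows.
odd = ["o0", "o1", "o2"]
even = ["e0", "e1"]

# zip_longest pads the shorter input with None (its default fillvalue);
# the padding is filtered out again while flattening the pairs.
merged = [
    item
    for pair in itertools.zip_longest(odd, even)
    for item in pair
    if item is not None
]
print(merged)
```

Unlike plain zip, this keeps the unpaired trailing item, which matters if the page has an odd number of rows.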