I'm trying to scrape the names of stock companies with BeautifulSoup, but I get "IndexError: list index out of range".
Here is my code:
from bs4 import BeautifulSoup
list = ['BABA', 'APPL']
stockname = []
for i in range(len(list)):
    stock_company = "https://finance.yahoo.com/quote/"+list[i]
    soup = BeautifulSoup(requests.get(stock_company).text,"html.parser").select('h1')[0].text.strip()[10:]
    stockname.append(soup)
stockname
Answer 0 (score: 0)
Why reinvent the wheel? Try the yahoo_finance module:
>>> from yahoo_finance import Share
>>> yahoo = Share('YHOO')
>>> print yahoo.get_open()
'36.60'
>>> print yahoo.get_price()
'36.84'
>>> print yahoo.get_trade_datetime()
'2014-02-05 20:50:00 UTC+0000'
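Answer 1 (score: 0)
You can scrape the company name from the "https://finance.yahoo.com/quote/{ticker}" URL, but all other data (such as volume and price) is loaded via Ajax from "https://query1.finance.yahoo.com". This example loads the company name and the previous closing price: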
import requests
from bs4 import BeautifulSoup
import json
from pprint import pprint

tickers = ['BABA', 'AAPL']
stockname = []

for ticker in tickers:
    # The company name is taken from the <h1> of the quote page.
    stock_company = f"https://finance.yahoo.com/quote/{ticker}"
    soup = BeautifulSoup(requests.get(stock_company).text, "html.parser")
    name = soup.h1.text.split('-')[1].strip()

    # Price data is loaded separately, as JSON, from the chart endpoint.
    ticker_data_url = f"https://query1.finance.yahoo.com/v8/finance/chart/{ticker}?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d&corsDomain=finance.yahoo.com&.tsrc=finance"
    ticker_data = json.loads(requests.get(ticker_data_url).text)
    price = ticker_data['chart']['result'][0]['meta']['previousClose']

    if name:
        stockname.append([ticker, name, price])

pprint(stockname, width=60)
This prints:
[['BABA', 'Alibaba Group Holding Limited', 187.25],
 ['AAPL', 'Apple Inc.', 191.44]]
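The price lookup above indexes straight into the JSON response, so a ticker with no chart data would raise instead of being skipped. A minimal sketch of a more defensive lookup follows. It assumes only the payload shape used in the answer (chart, result, meta, previousClose); the helper name previous_close is hypothetical and not part of any library.

```python
from typing import Any, Dict, Optional


def previous_close(payload: Dict[str, Any]) -> Optional[float]:
    """Return chart.result[0].meta.previousClose, or None if any level is missing."""
    # "result" may be absent, None, or an empty list when there is no data.
    results = (payload.get("chart") or {}).get("result") or []
    if not results:
        return None
    meta = results[0].get("meta") or {}
    return meta.get("previousClose")


# Payloads shaped like the one the answer reads from.
good = {"chart": {"result": [{"meta": {"previousClose": 187.25}}]}}
empty = {"chart": {"result": None}}

print(previous_close(good))   # 187.25
print(previous_close(empty))  # None
```

In the loop, price = previous_close(ticker_data) could replace the chained indexing, and the entry could be appended only when both name and price are present.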
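Apple's ticker symbol is AAPL, not APPL. With the mistyped symbol, select('h1') most likely returned an empty list, so indexing it with [0] raised the IndexError.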
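A mistyped ticker can be skipped instead of crashing the loop by checking the list of headings before indexing it. The sketch below assumes the "TICKER - Company Name" heading layout implied by the split('-') call in the answer above; the helper name name_from_headings is hypothetical.

```python
from typing import List, Optional


def name_from_headings(headings: List[str]) -> Optional[str]:
    """Return the company name from the first heading, or None if there is none."""
    if not headings:
        # An empty list is the case where headings[0] would raise IndexError.
        return None
    text = headings[0].strip()
    # Assumed layout: "TICKER - Company Name"; anything else is returned as-is.
    _, sep, name = text.partition(" - ")
    return name.strip() if sep else text


print(name_from_headings(["BABA - Alibaba Group Holding Limited"]))  # Alibaba Group Holding Limited
print(name_from_headings([]))                                         # None
```

With BeautifulSoup, the input list can be built as [h.text for h in soup.select('h1')], and tickers for which the helper returns None can simply be skipped.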