Question

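I wrote a program that takes stock tickers, scrapes the web to find a CSV of each ticker's historical prices, and plots them with matplotlib. Almost everything works, but I have run into a problem parsing the CSV to separate out each price.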

The error I get is:

prices = [float(row[4]) for row in csv_rows]

IndexError: list index out of range

I understand what the problem is; I am just not sure how to fix it.

(The problem is in the parseCSV() method.)

# Loop to chart multiple stocks
def chartStocks(*tickers):
    for ticker in tickers:
        chartStock(ticker)

# Single chart stock method
def chartStock(ticker):
    url = "http://finance.yahoo.com/q/hp?s=" + str(ticker) + "+Historical+Prices"
    sourceCode = requests.get(url)
    plainText = sourceCode.text
    soup = BeautifulSoup(plainText, "html.parser")
    csv = findCSV(soup)
    parseCSV(csv)

# Find the CSV URL        
def findCSV(soupPage):
    CSV_URL_PREFIX = 'http://real-chart.finance.yahoo.com/table.csv?s='
    links = soupPage.findAll('a')
    for link in links:
        href = link.get('href', '')
        if href.startswith(CSV_URL_PREFIX):
            return href

# Parse CSV for daily prices
def parseCSV(csv_text):
    csv_rows = csv.reader(csv_text.split('\n'))

    prices = [float(row[4]) for row in csv_rows]
    days = list(range(len(prices)))
    point = collections.namedtuple('Point', ['x', 'y'])

    for price in prices:
        i = 0
        p = point(days[i], prices[i])
        points = []
        points.append(p)
    print(points)

    plotStock(points)

# Plot the data
def plotStock(points):
    plt.plot(points)
    plt.show()

Answer 1

The problem is that parseCSV() expects a string containing CSV data, but it is actually being passed the URL of the CSV data, not the downloaded CSV data itself.

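This is because findCSV(soup) returns the href value of the CSV link found on the page, and that value is then passed straight to parseCSV(). The CSV reader sees a single line with no delimiters, so the row has only one column rather than the five or more that row[4] requires.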

At no point is the CSV data actually downloaded.
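The failure can be reproduced without any network access. The following is a minimal sketch: the URL is a hypothetical example of the kind findCSV() returns (assumed to contain no commas), and it is fed through the same csv.reader call used in the question's parseCSV().

```python
import csv

# Hypothetical URL of the kind findCSV() returns; it contains no commas.
csv_url = "http://real-chart.finance.yahoo.com/table.csv?s=AAPL"

# Same call as in the question's parseCSV(), but fed the URL instead of CSV data.
rows = list(csv.reader(csv_url.split('\n')))

print(rows)          # one row holding one field: the URL itself
print(len(rows[0]))  # 1, so rows[0][4] raises IndexError
```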

You could write the first few lines of parseCSV() like this:

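def parseCSV(csv_url):
    # Download the CSV data first; findCSV() only returns its URL
    r = requests.get(csv_url)
    # csv.reader needs text lines (iter_lines() yields bytes on Python 3)
    csv_rows = csv.reader(r.text.splitlines())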
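Once the text has been downloaded, the parsing step can be tested on its own. The sketch below is illustrative only: parse_close_prices is a hypothetical helper, the sample rows are made up, and it assumes the file starts with a header line and keeps the closing price in column index 4, as the question's row[4] implies.

```python
import csv

def parse_close_prices(csv_text):
    """Return column index 4 of every data row as floats, skipping the header."""
    reader = csv.reader(csv_text.splitlines())
    next(reader, None)  # assumed: the first line is a header such as "Date,Open,..."
    return [float(row[4]) for row in reader]

# Made-up sample in the assumed layout; real code would pass requests.get(csv_url).text.
sample = (
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2015-01-05,10.0,11.0,9.5,10.5,1000,10.5\n"
    "2015-01-02,9.0,10.2,8.9,10.0,1200,10.0\n"
)

prices = parse_close_prices(sample)
print(prices)  # [10.5, 10.0]
```

Keeping the download and the parsing in separate functions means the parser can be checked against a small string like this before any request is made.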

Answer 2

You need to check that each row contains at least five elements (that is, that index position 4 exists).

prices = [float(row[4]) for row in csv_rows if len(row) > 4]
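One case where this guard matters is a trailing newline: split('\n') then produces a final empty string, which csv.reader turns into an empty row. A small sketch with made-up rows (no header line, five columns assumed):

```python
import csv

# Made-up rows; the trailing newline makes split('\n') produce a final empty string.
csv_text = "2015-01-05,10.0,11.0,9.5,10.5\n2015-01-02,9.0,10.2,8.9,10.0\n"
csv_rows = list(csv.reader(csv_text.split('\n')))

print(csv_rows[-1])  # [] : the empty line parses to an empty row

# Without the length check, row[4] on the empty row raises IndexError.
prices = [float(row[4]) for row in csv_rows if len(row) > 4]
print(prices)  # [10.5, 10.0]
```

Note that the guard only skips short rows. If the input is the URL rather than the CSV data, every row is skipped and prices comes back empty, so the check hides that mistake rather than fixing it.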

Parsing a CSV to chart stock ticker data
