Pandas and financial calculations

Time: 2015-03-31 17:40:24

Tags: python pandas finance

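I'm new to Python and I'm working on a stock analysis script. The idea is that the script will eventually take a stock ticker and calculate the Sharpe ratio, the Treynor ratio, and other financial information. Right now I can't get Pandas to work properly: I can't access a single column of the DataFrame to calculate the stock's yield.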

from pandas.io.data import DataReader
from datetime import date, timedelta



def calc_yield(now, old):
    return (now-old)/old


def yield_array(cl):
    array = []
    count = 0
    for i in cl:
        old = cl[count]
        count += 1
        new = cl[count]
        array.append(calc_yield(new, old))
    return array


market = '^GSPC'
ticker = "AAPL"
days = 10

# set start and end dates
edate = date.today() - timedelta(days=1)
sdate = edate - timedelta(days=days)

# Read the stock price data from Yahoo
data = DataReader(ticker, 'yahoo', start=sdate, end=edate)

close = data['Adj Close']


print yield_array(close)

Error:

/Users/Tim/anaconda/bin/python "/Users/Tim/PycharmProjects/Test2/module tests.py"
Traceback (most recent call last):
  File "/Users/Tim/PycharmProjects/Test2/module tests.py", line 35, in <module>
    print yield_array(close)
  File "/Users/Tim/PycharmProjects/Test2/module tests.py", line 16, in yield_array
    new = cl[count]
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 484, in __getitem__
    result = self.index.get_value(self, key)
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/tseries/index.py", line 1243, in get_value
    return _maybe_box(self, Index.get_value(self, series, key), series, key)
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 1202, in get_value
    return tslib.get_value_box(s, key)
  File "tslib.pyx", line 540, in pandas.tslib.get_value_box (pandas/tslib.c:11833)
  File "tslib.pyx", line 555, in pandas.tslib.get_value_box (pandas/tslib.c:11680)
IndexError: index out of bounds

Process finished with exit code 1

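1 answer:

Answer 0 (score: 1)

I think I see your problem. Given this function (your yield_array with a few print statements added for debugging):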

def yield_array(cl):
    array = []
    count = 0
    for i in cl:
        old = cl[count]
        count += 1
        print count
        new = cl[count]
        array.append(calc_yield(new, old))
        print old
        print new
    return array

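The problem is that the loop runs once for every item in cl, and on the last item you still add 1 to count, so count ends up one greater than the largest valid index of cl. The lookup new = cl[count] then tries to access a position that doesn't exist, which raises the IndexError. You need to do something like for i in cl[:-1], which skips the last element.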
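A minimal sketch of the corrected loop, assuming the same calc_yield helper as in the question. The prices list is made-up sample data, not real quotes, and pairing each value with its successor through zip is just one way to avoid the out-of-range lookup:

```python
def calc_yield(now, old):
    # Simple return between two consecutive prices.
    return (now - old) / old


def yield_array(cl):
    # Copy to a plain list so positions are 0..n-1 whether cl is a
    # list or a pandas Series.
    values = list(cl)
    # Pair each price with the next one; the last price has no
    # successor, so n prices produce n-1 yields.
    return [calc_yield(new, old) for old, new in zip(values[:-1], values[1:])]


# Hypothetical closing prices, for illustration only.
prices = [100.0, 110.0, 99.0]
yields = yield_array(prices)
```

With three prices the result has two entries, one for each consecutive pair, and a single price gives an empty list instead of an error.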

However, there is a simpler way to do this using vectorization. You can reduce the whole function to:

close = data['Adj Close']
yield_data = close.diff()/close.shift(1)

Or, better yet, you can put the result back into the DataFrame for later use:

close = data['Adj Close']
data['Yield'] = close.diff()/close.shift(1)
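As a rough check of the vectorized version, the sketch below runs it on a small hand-made Series standing in for data['Adj Close'] (the prices and dates are hypothetical). It also computes the same quantity with Series.pct_change(), a pandas built-in not mentioned above, for comparison:

```python
import pandas as pd

# Hypothetical adjusted closing prices standing in for data['Adj Close'].
close = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2015-03-26", "2015-03-27", "2015-03-30"]),
)

# Vectorized yield: change from the previous row divided by the previous row.
manual = close.diff() / close.shift(1)

# Built-in equivalent: percentage change from the previous row.
builtin = close.pct_change()
```

In both results the first entry is NaN, because the first row has no previous price to compare against. Call dropna() on the result if downstream calculations cannot handle missing values.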