Relative Strength Index in Python pandas

Time: 2013-12-11 17:51:38

Tags: python pandas finance

I am new to pandas. What is the best way to calculate the relative strength part of the RSI indicator in pandas? So far I have the following:

from pylab import *
import time
import pandas as pd
import numpy as np



def Datapull(Stock):
    try:
        df = (pd.io.data.DataReader(Stock,'yahoo',start='01/01/2010'))
        print 'Retrieved', Stock
        time.sleep(5)
        # return only after logging and pausing; code after a return never runs
        return df
    except Exception, e:
        print 'Main Loop', str(e)


def RSIfun(price, n=14):
    delta = price['Close'].diff()
    #-----------
    dUp=
    dDown=

    RolUp=pd.rolling_mean(dUp, n)
    RolDown=pd.rolling_mean(dDown, n).abs()

    RS = RolUp / RolDown
    rsi= 100.0 - (100.0 / (1.0 + RS))
    return rsi

Stock='AAPL'
df=Datapull(Stock)
RSIfun(df)

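Am I doing this right so far? I am having trouble with the difference part of the equation, where you separate out the upward and downward calculations.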

12 answers:

Answer 0 (score: 32)

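It is important to note that there are various ways of defining the RSI. It is commonly defined in at least two ways: using a simple moving average (SMA), as in the question, or using an exponential moving average (EMA). Here is a code snippet that calculates both definitions of the RSI and plots them for comparison. I discard the first row after taking the difference, since it is always NaN by definition.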

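Note that you have to be careful when using an EMA: because it carries a memory going back to the beginning of the data, the result depends on where you start! For this reason, people typically add some data at the beginning, say 100 time steps, and then cut off the first 100 RSI values.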
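The start-point dependence can be illustrated with a small sketch. This uses current pandas (`Series.ewm`) and made-up random-walk prices rather than the Yahoo data; the helper name `ema_rsi` and the 300-step offset are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

def ema_rsi(close, n=14):
    """RSI from exponentially weighted means of gains and losses (hypothetical helper)."""
    delta = close.diff().iloc[1:]
    up = delta.clip(lower=0)
    down = -delta.clip(upper=0)          # losses as positive numbers
    rs = up.ewm(com=n).mean() / down.ewm(com=n).mean()
    return 100.0 - 100.0 / (1.0 + rs)

# Synthetic random-walk prices, made up for this example
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 600).cumsum())

full = ema_rsi(close)              # EMA memory starts at the beginning of the data
late = ema_rsi(close.iloc[300:])   # same prices, but the EMA starts 300 steps later

# Absolute disagreement between the two on the steps they share
gap = (full - late).dropna().abs()
print(gap.iloc[0], gap.iloc[100], gap.iloc[-1])
```

The disagreement should be largest right after the later start and shrink as more steps accumulate, which is why a warm-up stretch is usually thrown away.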

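In the plot below, you can see the difference between the RSI calculated using the SMA and the EMA: the SMA version tends to be more sensitive. Note that the EMA-based RSI has its first finite value at the first time step (which is the second time step of the original period, because the first row was discarded), whereas the SMA-based RSI has its first finite value at the 14th time step. This is because, by default, rolling_mean() only returns a finite value once there are enough values to fill the window.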

[Figure: A comparison of the RSI calculated using exponential or simple moving average]

import pandas
import pandas.io.data
import datetime
import matplotlib.pyplot as plt

# Window length for moving average
window_length = 14

# Dates
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2013, 1, 27)

# Get data
data = pandas.io.data.DataReader('AAPL', 'yahoo', start, end)
# Get just the close
close = data['Adj Close']
# Get the difference in price from previous step
delta = close.diff()
# Get rid of the first row, which is NaN since it did not have a previous 
# row to calculate the differences
delta = delta[1:] 

# Make the positive gains (up) and negative gains (down) Series
up, down = delta.copy(), delta.copy()
up[up < 0] = 0
down[down > 0] = 0

# Calculate the EWMA
roll_up1 = pandas.stats.moments.ewma(up, window_length)
roll_down1 = pandas.stats.moments.ewma(down.abs(), window_length)

# Calculate the RSI based on EWMA
RS1 = roll_up1 / roll_down1
RSI1 = 100.0 - (100.0 / (1.0 + RS1))

# Calculate the SMA
roll_up2 = pandas.rolling_mean(up, window_length)
roll_down2 = pandas.rolling_mean(down.abs(), window_length)

# Calculate the RSI based on SMA
RS2 = roll_up2 / roll_down2
RSI2 = 100.0 - (100.0 / (1.0 + RS2))

# Compare graphically
plt.figure()
RSI1.plot()
RSI2.plot()
plt.legend(['RSI via EWMA', 'RSI via SMA'])
plt.show()
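The snippet above targets the pandas of 2013; `pandas.io.data`, `pandas.rolling_mean` and `pandas.stats.moments.ewma` are no longer part of current pandas. A minimal sketch of the same two calculations with the current `ewm` and `rolling` methods might look like this. The function name `rsi_sma_ema` and the synthetic prices are assumptions for illustration, and `ewm(com=window_length)` is used on the understanding that the second positional argument of the old `ewma` was `com`.

```python
import numpy as np
import pandas as pd

def rsi_sma_ema(close, window_length=14):
    """Return (EWMA-based RSI, SMA-based RSI) for a price Series (hypothetical helper)."""
    # Price change from the previous step; drop the first row, which is NaN
    delta = close.diff().iloc[1:]
    up = delta.clip(lower=0)
    down = -delta.clip(upper=0)   # losses as positive numbers

    # EWMA-based RSI
    rs_ema = up.ewm(com=window_length).mean() / down.ewm(com=window_length).mean()
    rsi_ema = 100.0 - 100.0 / (1.0 + rs_ema)

    # SMA-based RSI
    rs_sma = up.rolling(window_length).mean() / down.rolling(window_length).mean()
    rsi_sma = 100.0 - 100.0 / (1.0 + rs_sma)
    return rsi_ema, rsi_sma

# Synthetic random-walk prices, made up for this example
rng = np.random.default_rng(1)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum())
rsi_ema, rsi_sma = rsi_sma_ema(close)

# Where each series gets its first non-NaN value
print(rsi_ema.first_valid_index(), rsi_sma.first_valid_index())
```

Calling `rsi_ema.plot()` and `rsi_sma.plot()` with matplotlib would reproduce the visual comparison.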

Answer 1 (score: 15)

dUp= delta[delta > 0]
dDown= delta[delta < 0]

You also need something like this:

RolUp = RolUp.reindex_like(delta, method='ffill')
RolDown = RolDown.reindex_like(delta, method='ffill')

Otherwise RS = RolUp / RolDown will not do what you want.

Edit: it seems this is a more accurate way to calculate RS:

# dUp= delta[delta > 0]
# dDown= delta[delta < 0]

# dUp = dUp.reindex_like(delta, fill_value=0)
# dDown = dDown.reindex_like(delta, fill_value=0)

dUp, dDown = delta.copy(), delta.copy()
dUp[dUp < 0] = 0
dDown[dDown > 0] = 0

RolUp = pd.rolling_mean(dUp, n)
RolDown = pd.rolling_mean(dDown, n).abs()

RS = RolUp / RolDown

Answer 2 (score: 9)

My answer was tested against the StockCharts sample data.

StockCharts RSI info: http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:relative_strength_index_rsi

def RSI(series, period):
    delta = series.diff().dropna()
    u = delta * 0
    d = u.copy()
    u[delta > 0] = delta[delta > 0]
    d[delta < 0] = -delta[delta < 0]
    u[u.index[period-1]] = np.mean( u[:period] ) # seed: average of the first `period` gains
    u = u.drop(u.index[:(period-1)])
    d[d.index[period-1]] = np.mean( d[:period] ) # seed: average of the first `period` losses
    d = d.drop(d.index[:(period-1)])
    rs = pd.stats.moments.ewma(u, com=period-1, adjust=False) / \
         pd.stats.moments.ewma(d, com=period-1, adjust=False)
    return 100 - 100 / (1 + rs)


#sample data from StockCharts
data = pd.Series( [ 44.34, 44.09, 44.15, 43.61,
                    44.33, 44.83, 45.10, 45.42,
                    45.84, 46.08, 45.89, 46.03,
                    45.61, 46.28, 46.28, 46.00,
                    46.03, 46.41, 46.22, 45.64 ] )
print RSI( data, 14 )

#output
14    70.464135
15    66.249619
16    66.480942
17    69.346853
18    66.294713
19    57.915021

Answer 3 (score: 4)

You can use rolling_apply in combination with a subfunction to make a clean function:

def rsi(price, n=14):
    ''' rsi indicator '''
    gain = (price-price.shift(1)).fillna(0) # calculate price gain with previous day, first row nan is filled with 0

    def rsiCalc(p):
        # subfunction for calculating rsi for one lookback period
        avgGain = p[p>0].sum()/n
        avgLoss = -p[p<0].sum()/n 
        rs = avgGain/avgLoss
        return 100 - 100/(1+rs)

    # run for all periods with rolling_apply
    return pd.rolling_apply(gain,n,rsiCalc) 
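`pd.rolling_apply` has since been removed from pandas. A rough equivalent with the current `Series.rolling(...).apply(...)` API could look like the sketch below. The function name `rsi_rolling_apply` and the sample prices are assumptions, and returning 100 when a window contains no losses is a convention borrowed from other answers in this thread rather than part of the original function.

```python
import numpy as np
import pandas as pd

def rsi_rolling_apply(price, n=14):
    """SMA-style RSI computed window by window (hypothetical helper)."""
    # Price change from the previous step; the first NaN is replaced by 0
    gain = price.diff().fillna(0)

    def rsi_calc(p):
        # p is a NumPy array holding one look-back window of changes
        avg_gain = p[p > 0].sum() / n
        avg_loss = -p[p < 0].sum() / n
        if avg_loss == 0:
            return 100.0  # assumed convention for a window without losses
        rs = avg_gain / avg_loss
        return 100 - 100 / (1 + rs)

    # raw=True passes plain arrays to rsi_calc instead of Series objects
    return gain.rolling(n).apply(rsi_calc, raw=True)

# Synthetic random-walk prices, made up for this example
rng = np.random.default_rng(2)
price = pd.Series(100 + rng.normal(0, 1, 100).cumsum())
rsi = rsi_rolling_apply(price)
print(rsi.tail())
```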

Answer 4 (score: 4)

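I ran into this problem too and was working down the rolling_apply path that Jev took. However, when I tested my results, they did not match the commercial stock charting programs I use, such as StockCharts.com or thinkorswim. So I did some digging and discovered that when Welles Wilder created the RSI, he used a smoothing technique that is now referred to as Wilder Smoothing. The commercial services above use Wilder Smoothing rather than a simple moving average to calculate the average gains and losses.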
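Written out, the calculation is roughly as follows. The symbols $U_t$, $D_t$ and the bars for smoothed averages are notation introduced here for convenience, not Wilder's own.

```latex
\begin{aligned}
U_t &= \max(P_t - P_{t-1},\, 0), \qquad D_t = \max(P_{t-1} - P_t,\, 0) \\
\overline{U}_t &= \frac{(n-1)\,\overline{U}_{t-1} + U_t}{n}, \qquad
\overline{D}_t = \frac{(n-1)\,\overline{D}_{t-1} + D_t}{n} \\
RS_t &= \frac{\overline{U}_t}{\overline{D}_t}, \qquad
RSI_t = 100 - \frac{100}{1 + RS_t}
\end{aligned}
```

The first $\overline{U}$ and $\overline{D}$ are seeded with a plain average of the first $n$ gains and losses. The recursion is the same as an exponential moving average with smoothing factor $\alpha = 1/n$.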

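I am new to Python (and pandas), so I am wondering whether there is a good way to refactor the for loop below to make it faster. Maybe someone else can comment on that possibility.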

I hope you find this useful.


def get_rsi_timeseries(prices, n=14):
    # RSI = 100 - (100 / (1 + RS))
    # where RS = (Wilder-smoothed n-period average of gains / Wilder-smoothed n-period average of -losses)
    # Note that losses above should be positive values
    # Wilder-smoothing = ((previous smoothed avg * (n-1)) + current value to average) / n
    # For the very first "previous smoothed avg" (aka the seed value), we start with a straight average.
    # Therefore, our first RSI value will be for the n+2nd period:
    #     0: first delta is nan
    #     1:
    #     ...
    #     n: lookback period for first Wilder smoothing seed value
    #     n+1: first RSI

    # First, calculate the gain or loss from one price to the next. The first value is nan so replace with 0.
    deltas = (prices-prices.shift(1)).fillna(0)

    # Calculate the straight average seed values.
    # The first delta is always zero, so we will use a slice of the first n deltas starting at 1,
    # and filter only deltas > 0 to get gains and deltas < 0 to get losses
    avg_of_gains = deltas[1:n+1][deltas > 0].sum() / n
    avg_of_losses = -deltas[1:n+1][deltas < 0].sum() / n

    # Set up pd.Series container for RSI values
    rsi_series = pd.Series(0.0, deltas.index)

    # Now calculate RSI using the Wilder smoothing method, starting with n+1 delta.
    up = lambda x: x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas[n+1:]:
        avg_of_gains = ((avg_of_gains * (n-1)) + up(d)) / n
        avg_of_losses = ((avg_of_losses * (n-1)) + down(d)) / n
        if avg_of_losses != 0:
            rs = avg_of_gains / avg_of_losses
            rsi_series[i] = 100 - (100 / (1 + rs))
        else:
            rsi_series[i] = 100
        i += 1

    return rsi_series
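One possible way to drop the for loop is to lean on the fact that the Wilder recursion is an exponential moving average with `alpha = 1/n` and `adjust=False`. The sketch below uses current pandas; the names `wilder_rsi` and `wilder_average` are hypothetical. Unlike the loop above, it also reports an RSI for the seed period itself (position n) and leaves earlier positions as NaN instead of 0.

```python
import numpy as np
import pandas as pd

def wilder_rsi(prices, n=14):
    """RSI with Wilder smoothing, using ewm instead of an explicit loop (sketch)."""
    deltas = prices.diff()
    gains = deltas.clip(lower=0)
    losses = -deltas.clip(upper=0)   # losses as positive numbers

    def wilder_average(x):
        # Seed with the plain mean of the first n deltas (positions 1..n),
        # then let ewm apply avg = ((n-1) * previous avg + current) / n
        seeded = x.iloc[n:].copy()
        seeded.iloc[0] = x.iloc[1:n + 1].mean()
        return seeded.ewm(alpha=1.0 / n, adjust=False).mean()

    rs = wilder_average(gains) / wilder_average(losses)
    rsi = 100.0 - 100.0 / (1.0 + rs)   # a zero average loss gives rs = inf, RSI = 100
    return rsi.reindex(prices.index)   # positions before the seed become NaN

# Sample prices quoted in another answer in this thread
data = pd.Series([44.34, 44.09, 44.15, 43.61, 44.33, 44.83, 45.10, 45.42,
                  45.84, 46.08, 45.89, 46.03, 45.61, 46.28, 46.28, 46.00,
                  46.03, 46.41, 46.22, 45.64])
print(wilder_rsi(data, 14).dropna())
```

On this data the first two values should agree with the 70.46 and 66.25 reported for the same prices in the StockCharts-based answer.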

Answer 5 (score: 1)

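Using numba speeds up Bill's answer considerably. For 100 loops over a 20K-row series: regular = 113 seconds, numba = 0.28 seconds. Numba excels at loops and arithmetic.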

import numpy as np
import numba as nb

@nb.jit(fastmath=True, nopython=True)   
def calc_rsi( array, deltas, avg_gain, avg_loss, n ):

    # Use Wilder smoothing method
    up   = lambda x:  x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas[n+1:]:
        avg_gain = ((avg_gain * (n-1)) + up(d)) / n
        avg_loss = ((avg_loss * (n-1)) + down(d)) / n
        if avg_loss != 0:
            rs = avg_gain / avg_loss
            array[i] = 100 - (100 / (1 + rs))
        else:
            array[i] = 100
        i += 1

    return array

def get_rsi( array, n = 14 ):   

    deltas = np.append([0],np.diff(array))

    avg_gain =  np.sum(deltas[1:n+1].clip(min=0)) / n
    avg_loss = -np.sum(deltas[1:n+1].clip(max=0)) / n

    array = np.empty(deltas.shape[0])
    array.fill(np.nan)

    array = calc_rsi( array, deltas, avg_gain, avg_loss, n )
    return array

# `prices` is a placeholder name: pass a NumPy array or a pandas Series of prices
rsi = get_rsi( prices, 14 )

Answer 6 (score: 0)

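from pandas.stats import moments  # legacy pandas module, removed in later releases

def RSI(series):
    delta = series.diff()
    u = delta * 0 
    d = u.copy()
    i_pos = delta > 0
    i_neg = delta < 0
    u[i_pos] = delta[i_pos]
    d[i_neg] = -delta[i_neg]  # store losses as positive values so rs stays non-negative
    rs = moments.ewma(u, span=27) / moments.ewma(d, span=27)
    return 100 - 100 / (1 + rs)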

Answer 7 (score: 0)

# Relative Strength Index
# Avg(PriceUp)/(Avg(PriceUp)+Avg(PriceDown))*100
# Where: PriceUp(t)=1*(Price(t)-Price(t-1)){Price(t)- Price(t-1)>0};
#        PriceDown(t)=-1*(Price(t)-Price(t-1)){Price(t)- Price(t-1)<0};
# Change the formula for your own requirement
def rsi(values):
    up = values[values>0].mean()
    down = -1*values[values<0].mean()
    return 100 * up / (up + down)

stock['RSI_6D'] = stock['Momentum_1D'].rolling(center=False,window=6).apply(rsi)
stock['RSI_12D'] = stock['Momentum_1D'].rolling(center=False,window=12).apply(rsi)

Momentum_1D = P(t) - P(t-1), where P is the closing price and t is the date.
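The formula in the comment is algebraically the same as the more common $100 - 100/(1 + RS)$ form. Writing $U$ for Avg(PriceUp) and $D$ for Avg(PriceDown) (symbols introduced here only for brevity):

```latex
100 - \frac{100}{1 + RS}
  = \frac{100\,RS}{1 + RS}
  = \frac{100\,\frac{U}{D}}{1 + \frac{U}{D}}
  = \frac{100\,U}{U + D},
\qquad RS = \frac{U}{D}
```

One difference worth keeping in mind: the rsi function above takes the mean over only the up days and only the down days in each window, rather than dividing both sums by the window length, so its values can differ from the SMA-based versions in other answers.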

Answer 8 (score: 0)

def rsi_Indictor(close, n):
    # close: data with a "Close" column; n: look-back period in days
    rsi_series = pd.DataFrame(close)


    # Change = close[i] - close[i-1]
    rsi_series["Change"] = (rsi_series["Close"] - rsi_series["Close"].shift(1)).fillna(0)

    # Upword Movement
    rsi_series["Upword Movement"] = (rsi_series["Change"][rsi_series["Change"] >0])
    rsi_series["Upword Movement"] = rsi_series["Upword Movement"].fillna(0)

    # Downword Movement
    rsi_series["Downword Movement"] = (abs(rsi_series["Change"])[rsi_series["Change"] <0]).fillna(0)
    rsi_series["Downword Movement"] = rsi_series["Downword Movement"].fillna(0)

    #Average Upword Movement
    # For first Upword Movement Mean of first n elements.
    rsi_series["Average Upword Movement"] = 0.00
    rsi_series["Average Upword Movement"][n] = rsi_series["Upword Movement"][1:n+1].mean()

    # For Second onwords
    for i in range(n+1,len(rsi_series),1):
        #print(rsi_series["Average Upword Movement"][i-1],rsi_series["Upword Movement"][i])
        rsi_series["Average Upword Movement"][i] = (rsi_series["Average Upword Movement"][i-1]*(n-1)+rsi_series["Upword Movement"][i])/n

    #Average Downword Movement
    # For first Downword Movement Mean of first n elements.
    rsi_series["Average Downword Movement"] = 0.00
    rsi_series["Average Downword Movement"][n] = rsi_series["Downword Movement"][1:n+1].mean()

    # For Second onwords
    for i in range(n+1,len(rsi_series),1):
        #print(rsi_series["Average Downword Movement"][i-1],rsi_series["Downword Movement"][i])
        rsi_series["Average Downword Movement"][i] = (rsi_series["Average Downword Movement"][i-1]*(n-1)+rsi_series["Downword Movement"][i])/n

    #Relative Index
    rsi_series["Relative Strength"] = (rsi_series["Average Upword Movement"]/rsi_series["Average Downword Movement"]).fillna(0)

    #RSI
    rsi_series["RSI"] = 100 - 100/(rsi_series["Relative Strength"]+1)
    return rsi_series.round(2)     


Answer 9 (score: 0)

You can also use the finta package for this.

Ref: https://github.com/peerchemist/finta/tree/master/examples

import pandas as pd
from finta import TA
import matplotlib.pyplot as plt

ohlc = pd.read_csv("C:\\WorkSpace\\Python\\ta-lib\\intraday_5min_IBM.csv", index_col="timestamp", parse_dates=True)
ohlc['RSI']= TA.RSI(ohlc)

Answer 10 (score: 0)

There is actually no need to calculate the means, because once you divide one by the other only the sums matter, so we can use Series.cumsum ...

def rsi(serie, n):

    diff_serie = serie.diff()  # use the function argument, not an outer `close` variable
    cumsum_incr = diff_serie.where(lambda x: x.gt(0), 0).cumsum()
    cumsum_decr = diff_serie.where(lambda x: x.lt(0), 0).abs().cumsum()
    rs_serie = cumsum_incr.div(cumsum_decr)
    rsi = rs_serie.mul(100).div(rs_serie.add(1)).fillna(0)

    return rsi
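Note that the function above accumulates from the start of the series, so `n` never enters the calculation. If a look-back window is wanted, one way to keep the cumsum idea is to subtract the cumulative sum from n steps earlier. The sketch below assumes current pandas; the name `rsi_cumsum` and the synthetic prices are made up for illustration.

```python
import numpy as np
import pandas as pd

def rsi_cumsum(close, n=14):
    """Windowed SMA-style RSI built from differences of cumulative sums (sketch)."""
    diff = close.diff().fillna(0)
    cum_incr = diff.clip(lower=0).cumsum()
    cum_decr = (-diff.clip(upper=0)).cumsum()

    # Sum of the last n changes = cumulative sum now minus cumulative sum n steps ago
    sum_incr = cum_incr - cum_incr.shift(n)
    sum_decr = cum_decr - cum_decr.shift(n)

    # The 1/n factors of the two averages cancel in the ratio
    rs = sum_incr / sum_decr
    return 100.0 - 100.0 / (1.0 + rs)

# Synthetic random-walk prices, made up for this example
rng = np.random.default_rng(3)
close = pd.Series(100 + rng.normal(0, 1, 200).cumsum())
rsi = rsi_cumsum(close, 14)
print(rsi.tail())
```

The first n positions come out as NaN because there is no cumulative sum n steps back to subtract.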

Answer 11 (score: -1)

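You can also use the following. The conditional logic ensures that the first RSI value is calculated differently (correctly) from the rest of the values. At the end, all NaN values are replaced with blanks.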

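This assumes you have already imported pandas and that your DataFrame is df. The only other data needed is a column of closing prices labeled Close. You can reference this column as df.Close; however, sometimes a column header consists of several words separated by spaces (which requires the df['word1 word2'] format). For consistency I always use the df['Close'] format.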

import numpy as np

# Calculate change in closing prices day over day
df['Delta'] = df['Close'].diff(periods=1, axis=0)

# Calculate if difference in close is Gain
conditions = [df['Delta'] <= 0, df['Delta'] > 0]
choices = [0, df['Delta']]
df['ClGain'] = np.select(conditions, choices)

# Calculate if difference in close is Loss
conditions = [df['Delta'] >= 0, df['Delta'] < 0]
choices = [0, -df['Delta']]
df['ClLoss'] = np.select(conditions, choices)

# Determine periods to calculate RSI over
rsi_n = 9

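# Calculate Avg Gain and Avg Loss over n periods.
# Each smoothed value depends on the previous row's result, and the AvgGain and
# AvgLoss columns do not exist yet, so np.select cannot be used here; fill them
# with an explicit loop instead. Assumes df has a default integer index (0, 1, 2, ...).
df['AvgGain'] = np.nan
df['AvgLoss'] = np.nan
# First value: plain average of the first rsi_n gains and losses
df.loc[rsi_n, 'AvgGain'] = df['ClGain'].iloc[1:rsi_n + 1].mean()
df.loc[rsi_n, 'AvgLoss'] = df['ClLoss'].iloc[1:rsi_n + 1].mean()
# Later values: ((previous average * (n - 1)) + current value) / n
for i in range(rsi_n + 1, len(df)):
    df.loc[i, 'AvgGain'] = (df.loc[i - 1, 'AvgGain'] * (rsi_n - 1) + df.loc[i, 'ClGain']) / rsi_n
    df.loc[i, 'AvgLoss'] = (df.loc[i - 1, 'AvgLoss'] * (rsi_n - 1) + df.loc[i, 'ClLoss']) / rsi_n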

# Calculate RSI
df['RSI'] = 100-(100 / (1 + (df['AvgGain'] / df['AvgLoss'])))

# Replace NaN cells with blanks
df = df.replace(np.nan, "", regex=True)

# (OPTIONAL) Remove columns used to create RSI
del df['Delta']
del df['ClGain']
del df['ClLoss']
del df['AvgGain']
del df['AvgLoss']