How do I calculate column values row by row in a pandas MultiIndex DataFrame?

Asked: 2018-10-10 09:59:08

Tags: pandas multi-index

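I have two DataFrames: the_strongest_stock and data. They share the same MultiIndex: Date, Symbol. the_strongest_stock has one relevant column: Entry_signal. data has the columns Open, High, Low, Close, ATR and No_of_data. I join the_strongest_stock onto data and log the positions.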

import numpy as np

# Entry signal---------------------------------------------------------------

data = data.join(the_strongest_stock[['Entry_signal']], on=['Date','Symbol'])
data.loc[data['No_of_data'] < 250,'Entry_signal'] = False
data['Entry_signal'] = data['Entry_signal'].fillna(False)

# Position log---------------------------------------------------------------

data['Initial_stop'] = (data['Close'] - 10 * data['ATR']).round(2)
data['Stop_level'] = data.groupby(['Symbol',data.groupby('Symbol')['Entry_signal'].cumsum()])['Initial_stop'].cummax()
data['Exit_signal'] = np.where(data['Low'] < data.groupby('Symbol')['Stop_level'].shift(1),True,False)
data['Position'] = np.where(data['Exit_signal'] == True,False,np.where(data['Entry_signal'] == True,True,np.nan))
data['Position'] = data.groupby('Symbol')['Position'].ffill()
data.loc[data['Position'] == False,'Stop_level'] = False
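
The Stop_level line above groups rows by Symbol and by a running count of entry signals, so the trailing maximum restarts at every new entry. A minimal sketch of that pattern on made-up data (the toy frame and its values are assumptions, not the asker's data):

```python
import pandas as pd

# Toy data: one symbol with entries on rows 0 and 3; all values are invented.
idx = pd.MultiIndex.from_product(
    [pd.date_range("2018-01-01", periods=5), ["AAA"]],
    names=["Date", "Symbol"],
)
toy = pd.DataFrame(
    {
        "Entry_signal": [True, False, False, True, False],
        "Initial_stop": [10.0, 12.0, 11.0, 9.0, 9.5],
    },
    index=idx,
)

# Running count of entries per symbol: each new entry starts a new trade id.
trade_id = toy.groupby("Symbol")["Entry_signal"].cumsum()

# Trailing stop: highest Initial_stop seen so far within the current trade.
toy["Stop_level"] = toy.groupby(["Symbol", trade_id])["Initial_stop"].cummax()
print(toy[["Entry_signal", "Initial_stop", "Stop_level"]])
```

The stop only ratchets upward inside a trade and drops back to the new Initial_stop when the next entry signal arrives.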

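Now that I have the combined DataFrame data with the logged Position, I create columns to calculate the floating profit and the closed profit produced by the trades.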

# Shares---------------------------------------------------------------

# Note: risk_per_position and skid are used below but are not defined in the question.
starting_capital = 30000

risk_base = starting_capital
trade_base = starting_capital

close_stop_diff = np.maximum(data.groupby('Symbol')['Close'].shift(1) - data.groupby('Symbol')['Stop_level'].shift(1),0.000001) # to avoid division by zero in share_number
share_number = (risk_base * risk_per_position) / close_stop_diff

data['Shares'] = np.where(
   data['Position'] == False,
   0,
   np.where(
      np.logical_and(
         data.groupby('Symbol')['Position'].shift(1) == False,
         data['Position'] == True),
      np.where(
         share_number * data.groupby('Symbol')['Close'].shift(1) > trade_base + 0.05 * risk_base,
         0,
         share_number),
      np.nan))
data = data.sort_values(by=['Date', 'Symbol'])
data['Shares'] = data.groupby('Symbol')['Shares'].ffill()
data['Shares'] = data['Shares'].astype(float).round(0)

# Open price---------------------------------------------------------------

data.loc[data['Position'] == False, 'Open_price'] = False
data['Open_price'] = np.where(
   data['Exit_signal'] == True,
   False,
   np.where(
      np.logical_and(
         data.groupby('Symbol')['Position'].shift(1) == False,
         data['Position'] == True),
      data['Open'] + skid * (data['High'] - data['Open']),
      np.nan))
data = data.sort_values(by=['Date', 'Symbol'])
data['Open_price'] = data.groupby('Symbol')['Open_price'].ffill()
data['Open_price'] = data['Open_price'].round(2)

# Open value---------------------------------------------------------------

data['Open_value'] = np.where(
   data['Position'] == True,
   data['Shares'] * data['Open_price'],
   0)

# Floating profit---------------------------------------------------------------

data['Floating_P/L'] = np.where(
   data['Position'] == True,
   data['Shares'] * (data['Close'] - data['Open_price']),
   0)
data['Floating_P/L'] = data['Floating_P/L'].fillna(0)
data['Floating_P/L'] = data['Floating_P/L'].round(2)

# Close price---------------------------------------------------------------

data['Close_price'] = np.where(
   np.logical_and(
      data.groupby('Symbol')['Position'].shift(1) == True,
      data['Position'] == False),
   data.groupby('Symbol')['Stop_level'].shift(1) - skid * (data.groupby('Symbol')['Stop_level'].shift(1) - data['Low']),
   False)
data['Close_price'] = data['Close_price'].astype(float).round(2)

# Closed profit---------------------------------------------------------------

data['Closed_P/L'] = np.where(
   np.logical_and(
      data.groupby('Symbol')['Position'].shift(1) == True,
      data['Position'] == False),
   data.groupby('Symbol')['Shares'].shift(1) * (data['Close_price'] - data.groupby('Symbol')['Open_price'].shift(1)),
   0)
data['Closed_P/L'] = data['Closed_P/L'].fillna(0)
data['Closed_P/L'] = data['Closed_P/L'].round(2)
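
Several of the columns above rely on the same idea: compare Position with its previous value within each Symbol to find the row where a trade opens or closes. A small sketch of that comparison on invented data (the Is_entry and Is_exit column names are hypothetical):

```python
import pandas as pd

# Toy position log for one symbol; values are invented for illustration.
idx = pd.MultiIndex.from_product(
    [pd.date_range("2018-01-01", periods=4), ["AAA"]],
    names=["Date", "Symbol"],
)
toy = pd.DataFrame({"Position": [False, True, True, False]}, index=idx)

# Previous row's Position within the same symbol (missing on the first row).
prev = toy.groupby("Symbol")["Position"].shift(1)

# A trade opens when Position flips False -> True, and closes on True -> False.
toy["Is_entry"] = (prev == False) & (toy["Position"] == True)
toy["Is_exit"] = (prev == True) & (toy["Position"] == False)
print(toy)
```

The first row of each symbol has no previous value, so neither flag is set there.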

Finally, I create columns that total, for each Date, the profit and loss produced by every Symbol.

# Totals---------------------------------------------------------------

data['Total_Open_Value'] = data.groupby(by=['Date','Symbol'])['Open_value'].sum().groupby(level=[0]).cumsum().groupby('Date').tail(1)
data['Total_Open_Value'] = data.groupby('Date')['Total_Open_Value'].bfill()

data['Floating_Total'] = data.groupby(by=['Date','Symbol'])['Floating_P/L'].sum().groupby(level=[0]).cumsum().groupby('Date').tail(1)
data['Floating_Total'] = data.groupby('Date')['Floating_Total'].bfill()

data['Closed_Total'] = data.groupby(by=['Date','Symbol'])['Closed_P/L'].sum().groupby(level=[0]).cumsum().groupby('Date').tail(1).cumsum()
data['Closed_Total'] = data.groupby('Date')['Closed_Total'].bfill()

data['Closed_Balance'] = starting_capital + data['Closed_Total'] - data['Total_Open_Value']

data['Equity'] = data['Closed_Balance'] + data['Floating_Total'] + data['Total_Open_Value']
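
Assuming each (Date, Symbol) pair appears only once, the per-Date totals above might also be written more directly with a grouped transform, and the running closed total with a cumulative sum over the daily sums. A sketch on invented numbers, not a drop-in replacement for the code above:

```python
import pandas as pd

# Toy frame: two dates, two symbols; all values are invented.
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2018-01-01", "2018-01-02"]), ["AAA", "BBB"]],
    names=["Date", "Symbol"],
)
toy = pd.DataFrame(
    {"Open_value": [100.0, 50.0, 0.0, 70.0], "Closed_P/L": [0.0, 5.0, -2.0, 4.0]},
    index=idx,
)

# Per-date total, broadcast to every row of that date.
toy["Total_Open_Value"] = toy.groupby("Date")["Open_value"].transform("sum")

# Closed P/L: sum per date, accumulate over dates, then broadcast back to rows.
running_closed = toy.groupby("Date")["Closed_P/L"].sum().cumsum()
dates = toy.index.get_level_values("Date")
toy["Closed_Total"] = running_closed.reindex(dates).to_numpy()
print(toy)
```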

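My problem is that, done this way, starting_capital is a constant (middle code section, first line of code), whereas I want the values of the Shares column for a given Date to be calculated from the previous Date's Equity and Closed_Balance. I tried reassigning the relevant variables used to calculate Shares at the end of the code, like this:

risk_base = data.groupby('Date')['Equity'].tail(1).shift(1)
trade_base = data.groupby('Date')['Closed_Balance'].tail(1).shift(1)

but this does not dynamically change the values of the Shares column. I want the Shares column to know that, from the second Date index onward, it should use the previous Date's Equity and Closed_Balance values. At the first Date index it should use the variable starting_capital, which is the initial value of Equity / Closed_Balance. How can I make this work? I know that Equity / Closed_Balance are calculated from Shares, and that Shares is calculated from Equity / Closed_Balance, but the latter should be shifted by one Date index, and they also have initial values. So the logic should work, shouldn't it? I have been stuck on this problem for a few days now, and it would be great if someone could help. Thanks!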
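
The dependency described here is recursive: Shares on one Date needs Equity from the previous Date, which in turn depends on earlier Shares. Reassigning risk_base after the Shares column has already been computed does not re-run that computation. One possible approach, sketched below under heavy simplifications, is to walk the dates in order and carry the balances forward. The function size_positions and the columns Entry, Exit, Fill and Risk_per_share are hypothetical stand-ins, not the asker's columns, and the affordability check uses the fill price rather than the previous close.

```python
import pandas as pd

def size_positions(df, starting_capital, risk_per_position):
    """Hypothetical helper: size each entry from the previous date's balances."""
    risk_base = trade_base = starting_capital  # used on the first date
    closed_total = 0.0                         # realised P/L so far
    open_pos = {}                              # symbol -> (shares, entry price)
    shares_log, records = {}, []

    for date, day in df.groupby(level="Date", sort=True):
        for (_, symbol), row in day.iterrows():
            if row["Exit"] and symbol in open_pos:
                shares, entry = open_pos.pop(symbol)
                closed_total += shares * (row["Fill"] - entry)
            elif row["Entry"] and symbol not in open_pos:
                shares = int(round(risk_base * risk_per_position / row["Risk_per_share"]))
                # Simplified affordability check (assumption: skip the trade if too large).
                if shares * row["Fill"] <= trade_base + 0.05 * risk_base:
                    open_pos[symbol] = (shares, row["Fill"])
                    shares_log[(date, symbol)] = shares

        # End-of-day totals, mirroring the Closed_Balance / Equity formulas above.
        closes = day["Close"].droplevel("Date")
        open_value = sum(s * e for s, e in open_pos.values())
        floating = sum(s * (closes[sym] - e) for sym, (s, e) in open_pos.items())
        closed_balance = starting_capital + closed_total - open_value
        equity = closed_balance + floating + open_value
        records.append({"Date": date, "Closed_Balance": closed_balance, "Equity": equity})

        # Carry today's balances forward as the next date's sizing bases.
        risk_base, trade_base = equity, closed_balance

    return pd.DataFrame(records).set_index("Date"), shares_log

# Toy data: three dates, two symbols; every number is invented.
idx = pd.MultiIndex.from_product(
    [pd.date_range("2018-01-01", periods=3), ["AAA", "BBB"]],
    names=["Date", "Symbol"],
)
toy = pd.DataFrame(
    {
        "Entry": [True, False, False, True, False, False],
        "Exit": [False, False, False, False, True, False],
        "Fill": [10.0, 20.0, 20.0, 20.0, 20.0, 21.0],
        "Close": [20.0, 20.0, 20.0, 20.0, 20.0, 22.0],
        "Risk_per_share": [2.0] * 6,
    },
    index=idx,
)
summary, shares = size_positions(toy, starting_capital=1000, risk_per_position=0.02)
print(summary)
print(shares)
```

In this toy run the second entry is sized from the first date's equity rather than from starting_capital, which is the behaviour the question asks for. A row-by-row loop like this is typically much slower than vectorised pandas code, so it is offered only as an illustration of the ordering, not as a tuned implementation.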

0 answers:

No answers