Question

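I have this pandas DataFrame, where a 1 in long_entry or short_entry means that the corresponding long or short position was entered at that time, and a 1 in long_exit or short_exit means the trade was exited. How can I calculate the PnL of each trade and show it in a new column df['pnl_per_trade']?

This backtest can hold at most 1 trade/position at any point in time.

Below is my DataFrame. As we can see, a long trade is entered on 26/2/2019 and closed on 1/3/2019, for a PnL of $64.45, while a short trade is entered on 4/3/2019 and closed on 5/3/2019, for a PnL of -$119.11 (a loss).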

        date    price       long_entry  long_exit   short_entry short_exit
0   24/2/2019   4124.25           0          0           0              0
1   25/2/2019   4130.67           0          0           0              0
2   26/2/2019   4145.67           1          0           0              0
3   27/2/2019   4180.10           0          0           0              0
4   28/2/2019   4200.05           0          0           0              0
5   1/3/2019    4210.12           0          1           0              0
6   2/3/2019    4198.10           0          0           0              0
7   3/3/2019    4210.34           0          0           0              0
8   4/3/2019    4100.12           0          0           1              0
9   5/3/2019    4219.23           0          0           0              1

I would like output like this:

        date    price       long_entry  long_exit   short_entry short_exit  pnl
0   24/2/2019   4124.25           0          0           0             0    NaN
1   25/2/2019   4130.67           0          0           0             0    NaN
2   26/2/2019   4145.67           1          0           0             0  64.45
3   27/2/2019   4180.10           0          0           0             0    NaN
4   28/2/2019   4200.05           0          0           0             0    NaN
5   1/3/2019    4210.12           0          1           0             0    NaN
6   2/3/2019    4198.10           0          0           0             0    NaN
7   3/3/2019    4210.34           0          0           0             0    NaN
8   4/3/2019    4100.12           0          0           1             0 -119.11
9   5/3/2019    4219.23           0          0           0             1    NaN

Since I have a lot of data, I would like the code to avoid loops as far as possible. Thanks!

Answer 1

I extended your sample data so that it contains 2 long trades, and converted the date column to datetime:

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[
    [ '24/2/2019', 4124.25, 0, 0, 0, 0 ],
    [ '25/2/2019', 4130.67, 0, 0, 0, 0 ],
    [ '26/2/2019', 4145.67, 1, 0, 0, 0 ],
    [ '27/2/2019', 4180.10, 0, 0, 0, 0 ],
    [ '28/2/2019', 4200.05, 0, 0, 0, 0 ],
    [ '1/3/2019',  4210.12, 0, 1, 0, 0 ],
    [ '2/3/2019',  4198.10, 0, 0, 0, 0 ],
    [ '3/3/2019',  4210.34, 0, 0, 0, 0 ],
    [ '4/3/2019',  4100.12, 0, 0, 1, 0 ],
    [ '5/3/2019',  4219.23, 0, 0, 0, 1 ],
    [ '6/3/2019',  4210.00, 1, 0, 0, 0 ],
    [ '7/3/2019',  4212.00, 0, 0, 0, 0 ],
    [ '8/3/2019',  4214.00, 0, 1, 0, 0 ]],
    columns=['date','price', 'long_entry', 'long_exit',
        'short_entry', 'short_exit'])
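# The dates are day-first (1/3/2019 is 1 March 2019), so state the format explicitly
df.date = pd.to_datetime(df.date, format='%d/%m/%Y')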
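The date strings in the sample are written day/month/year, so '1/3/2019' means 1 March 2019. As a small check, and assuming that reading of the data, the sketch below parses a few of them with an explicit format string. The names raw and parsed are illustrative only.

```python
import pandas as pd

# Three dates copied from the sample data, written day/month/year.
raw = pd.Series(['26/2/2019', '1/3/2019', '4/3/2019'])

# An explicit format removes any ambiguity between day-first and month-first.
parsed = pd.to_datetime(raw, format='%d/%m/%Y')

print(parsed.dt.strftime('%Y-%m-%d').tolist())
# ['2019-02-26', '2019-03-01', '2019-03-04']
```

Without a format (or dayfirst=True), some pandas versions may read a string such as '1/3/2019' as 3 January, which would put the trades in the wrong order.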

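The next step is to generate df2, containing only the rows where a long trade starts or ends (strictly, only the date and price columns are needed, but for illustration I also keep long_entry and long_exit):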

df2 = df.query('long_entry > 0 or long_exit > 0').iloc[:,0:4]; df2

The result (for my data) is:

         date    price  long_entry  long_exit
2  2019-02-26  4145.67           1          0
5  2019-03-01  4210.12           0          1
10 2019-03-06  4210.00           1          0
12 2019-03-08  4214.00           0          1

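Then we have to define a function that will be applied to each entry/exit pair:

def fn(src):
    # src holds one trade: row 0 is the entry, row 1 is the exit.
    # Return the entry date and the price difference (exit - entry).
    return pd.Series([src.iloc[0, 0], src.iloc[1, 1] - src.iloc[0, 1]])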

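The next step is to apply the function above to consecutive pairs of rows (entry and exit), set the column names, and make the date column the index: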

lProf = df2.groupby(np.arange( len(df2.index)) // 2).apply(fn)
lProf.columns = ['date', 'pnl']
lProf.set_index('date', inplace=True)

The result is:

             pnl
date             
2019-02-26  64.45
2019-03-06   4.00
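The grouping key np.arange(len(df2.index)) // 2 is what pairs each entry row with the exit row that follows it. A minimal sketch of that idea on a made-up four-row frame (demo and its prices are hypothetical, not part of the sample data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows alternate entry, exit, entry, exit.
demo = pd.DataFrame({'price': [100.0, 110.0, 120.0, 115.0]})

# Integer-dividing the row positions by 2 labels rows 0-1 as group 0
# and rows 2-3 as group 1.
pair_id = np.arange(len(demo)) // 2
print(pair_id.tolist())  # [0, 0, 1, 1]

# Within each pair, last price minus first price is exit minus entry.
pnl = demo.groupby(pair_id)['price'].agg(lambda s: s.iloc[-1] - s.iloc[0])
print(pnl.tolist())  # [10.0, -5.0]
```

This pairing relies on the rows strictly alternating between entry and exit, which holds here because at most one position is open at a time.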

So far we have the data to insert for the long entries. Now it is time to generate a similar DataFrame for the short entries, applying the same function as before:

df2 = df.query('short_entry > 0 or short_exit > 0').iloc[:,[0, 1, 4, 5]]
sProf = df2.groupby(np.arange( len(df2.index)) // 2).apply(fn)
sProf.columns = ['date', 'pnl']
sProf.set_index('date', inplace=True)

But this time we have to flip the sign of the resulting values, because a short trade profits when the price falls:

sProf = -sProf

The result is:

               pnl
date              
2019-03-04 -119.11
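The sign flip can be checked by hand with the two prices of the short trade in the sample data. This is only a worked arithmetic sketch; the variable names are illustrative.

```python
# Prices taken from the sample data: short entered on 4/3/2019, exited on 5/3/2019.
entry_price = 4100.12
exit_price = 4219.23

# A long trade gains when the price rises: exit - entry.
# A short trade gains when the price falls: entry - exit,
# which is the negated long formula.
short_pnl = -(exit_price - entry_price)

print(round(short_pnl, 2))  # -119.11
```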

Before adding the results to the main DataFrame, we have to set the date column as its index:

df.set_index('date', inplace=True)

Now we add the results for the long entries:

df['pnl'] = lProf

This has created the new column, so to add the results for the short entries we have to do an update:

df.update(sProf)
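DataFrame.update modifies the frame in place, aligns on index and column name, and only overwrites cells where the other frame has a non-NA value. A small sketch on hypothetical frames (base and other are made-up names and data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one value already filled in.
base = pd.DataFrame({'pnl': [64.45, np.nan, np.nan]}, index=['a', 'b', 'c'])

# Hypothetical update that carries a value for row 'b' only.
other = pd.DataFrame({'pnl': [-119.11]}, index=['b'])

# Row 'a' keeps its value, row 'b' is filled, row 'c' stays NaN.
base.update(other)

print(base['pnl'].tolist())  # [64.45, -119.11, nan]
```

This is why the long results already stored in the pnl column survive the update with the short results.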

If you want date back as a regular column, run:

df.reset_index(inplace=True)
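As an alternative to the groupby-based steps above, the same per-trade PnL can be sketched with boolean masks and NumPy arrays only. This is a different technique from the one in this answer, not a restatement of it. The helper trade_pnl is hypothetical, and it assumes that every entry has a matching later exit and that entries and exits alternate.

```python
import pandas as pd

# First ten rows of the sample data (price and signal columns only).
df = pd.DataFrame({
    'price': [4124.25, 4130.67, 4145.67, 4180.10, 4200.05,
              4210.12, 4198.10, 4210.34, 4100.12, 4219.23],
    'long_entry':  [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    'long_exit':   [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    'short_entry': [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    'short_exit':  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
})


def trade_pnl(frame, entry_col, exit_col, sign):
    """Hypothetical helper: PnL per trade, labelled by the entry row."""
    entry_mask = frame[entry_col] == 1
    exit_mask = frame[exit_col] == 1
    entry_px = frame.loc[entry_mask, 'price'].to_numpy()
    exit_px = frame.loc[exit_mask, 'price'].to_numpy()
    # Assumption: the i-th exit closes the i-th entry (one position at a time),
    # so both arrays have the same length.
    return pd.Series(sign * (exit_px - entry_px), index=frame.index[entry_mask])


# Long trades use sign +1, short trades use sign -1.
# Assignment aligns on the index, so all other rows become NaN.
df['pnl'] = pd.concat([
    trade_pnl(df, 'long_entry', 'long_exit', 1),
    trade_pnl(df, 'short_entry', 'short_exit', -1),
])

print(df['pnl'].dropna().round(2).to_dict())  # {2: 64.45, 8: -119.11}
```

If the data can end with a position still open, the entry and exit arrays will differ in length and the subtraction will fail, so such a trailing entry would need to be dropped first.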

Answer 2

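I am not sure whether this helps, but I think your concept of PnL may be incorrect. The code below shows how to get a running (daily) PnL value rather than the PnL of each position.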

def get_position(long_entry,long_exit, short_entry,short_exit):
    if long_entry == 1 or short_exit == 1:
        position = 1
    elif long_exit == 1 or short_entry == 1:
        position = -1
    else:
        position = 0

    return position

df['position'] = list(map(get_position, df.long_entry.values, df.long_exit.values, df.short_entry.values, df.short_exit.values))

# .copy() avoids a SettingWithCopyWarning when new columns are added below
df = df[['date', 'price','position']].copy()

df['amount'] = -df['price']*df['position']
df['pnl'] = df['amount'].cumsum()

Here is the result:

        date    price  position   amount      pnl
0  24/2/2019  4124.25         0    -0.00    -0.00
1  25/2/2019  4130.67         0    -0.00    -0.00
2  26/2/2019  4145.67         1 -4145.67 -4145.67
3  27/2/2019  4180.10         0    -0.00 -4145.67
4  28/2/2019  4200.05         0    -0.00 -4145.67
5   1/3/2019  4210.12        -1  4210.12    64.45
6   2/3/2019  4198.10         0    -0.00    64.45
7   3/3/2019  4210.34         0    -0.00    64.45
8   4/3/2019  4100.12        -1  4100.12  4164.57
9   5/3/2019  4219.23         1 -4219.23   -54.66

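This is the cumulative PnL, irrespective of whether the position is long or short. Note that while a position is open the running total shows the entry cash flow (for example -4145.67 on rows 2 to 4), so it equals the realized PnL only on rows where no position is open. Hope this helps.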
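Since the question asks to avoid loops, the position column can also be built without map, using boolean masks. This is a sketch that assumes at most one signal is set per row. If a row had both a buy and a sell signal, get_position above would return 1, while this version would return 0.

```python
import pandas as pd

# The ten-row sample from the question (price and signal columns only).
df = pd.DataFrame({
    'price': [4124.25, 4130.67, 4145.67, 4180.10, 4200.05,
              4210.12, 4198.10, 4210.34, 4100.12, 4219.23],
    'long_entry':  [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    'long_exit':   [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    'short_entry': [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    'short_exit':  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
})

# Buying happens on a long entry or a short exit, selling on the opposite.
buys = (df['long_entry'] == 1) | (df['short_exit'] == 1)
sells = (df['long_exit'] == 1) | (df['short_entry'] == 1)

# +1 for a buy, -1 for a sell, 0 otherwise.
df['position'] = buys.astype(int) - sells.astype(int)

# Cash flow per row and its running total, as in the answer.
df['amount'] = -df['price'] * df['position']
df['pnl'] = df['amount'].cumsum()

print(df['position'].tolist())        # [0, 0, 1, 0, 0, -1, 0, 0, -1, 1]
print(round(df['pnl'].iloc[-1], 2))   # -54.66
```

The final value of -54.66 is the sum of the two per-trade results in the question, 64.45 and -119.11.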

Calculating profit and loss (PnL) per trade with a pandas DataFrame
