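I am trying to group the DataFrame below by Expiration and Strike. After that, I want to calculate the difference between the call price and the put price for all options that share the same strike and expiration. In the example below, only rows 1 and 2 would produce a result: 15.370001 - 1.495 ≈ 13.875.
How can I do this without writing a for loop? I was thinking of something like the following: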
df.groupby(["Expiration","Strike"]).agg(lambda x: x[x.Type == "call"].Price - x[x.Type == "put"].Price + x.Strike)
However, I am not sure how to pass the condition (Type equal to "call") to the groupby function.
   Type       Price  Expiration  Strike
0   put  145.000000  2021-01-15   420.0
1  call   15.370001  2018-11-30   262.0
2   put    1.495000  2018-11-30   262.0
3  call   14.930000  2018-11-30   262.5
Answer 0 (score: 1)
You can use GroupBy.apply with a custom function that uses next and iter to get the first matching value, or NaN if there is no match:
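import numpy as np

def f(x):
    # First call/put price in the group, or NaN if the group has none
    c = next(iter(x.loc[x.Type == "call", 'Price']), np.nan)
    p = next(iter(x.loc[x.Type == "put", 'Price']), np.nan)
    x['new'] = c - p + x.Strike
    return x

# group_keys=False keeps the original row index in the result
df = df.groupby(["Expiration","Strike"], group_keys=False).apply(f)
print(df)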
   Type       Price  Expiration  Strike         new
0   put  145.000000  2021-01-15   420.0         NaN
1  call   15.370001  2018-11-30   262.0  275.875001
2   put    1.495000  2018-11-30   262.0  275.875001
3  call   14.930000  2018-11-30   262.5         NaN
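The result shape of GroupBy.apply has varied across pandas versions, so the same "first value or NaN" idea can also be sketched without apply. This is a minimal sketch, not part of the original answer: it rebuilds the sample frame from the question, and the names `keys`, `call`, `put` and `out` are illustrative assumptions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Type": ["put", "call", "put", "call"],
    "Price": [145.0, 15.370001, 1.495, 14.93],
    "Expiration": pd.to_datetime(
        ["2021-01-15", "2018-11-30", "2018-11-30", "2018-11-30"]
    ),
    "Strike": [420.0, 262.0, 262.0, 262.5],
})

# Group by the two key columns (passed as Series so they align by index).
keys = [df["Expiration"], df["Strike"]]

# Blank out prices of the other type, then broadcast the first remaining
# (non-NaN) price to every row of its group; groups without one stay NaN.
call = df["Price"].where(df["Type"] == "call").groupby(keys).transform("first")
put = df["Price"].where(df["Type"] == "put").groupby(keys).transform("first")

# Same formula as in the question: call - put + Strike.
out = df.assign(new=call - put + df["Strike"])
print(out)
```

Here `transform("first")` returns a Series aligned to the original rows, so no join or index handling is needed afterwards.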
Another solution:
# if `call` and `put` may not be unique per group
c = df[df.Type == "call"].groupby(["Expiration","Strike"])['Price'].first()
p = df[df.Type == "put"].groupby(["Expiration","Strike"])['Price'].first()
# if `call` and `put` are unique per group
#c = df[df.Type == "call"].set_index(["Expiration","Strike"])['Price']
#p = df[df.Type == "put"].set_index(["Expiration","Strike"])['Price']
# c - p aligns on (Expiration, Strike); keys missing on one side give NaN
df1 = df.join((c - p).rename('new'), on=["Expiration","Strike"])
df1['new'] += df1['Strike']
print(df1)
   Type       Price  Expiration  Strike         new
0   put  145.000000  2021-01-15   420.0         NaN
1  call   15.370001  2018-11-30   262.0  275.875001
2   put    1.495000  2018-11-30   262.0  275.875001
3  call   14.930000  2018-11-30   262.5         NaN
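If only the plain call minus put difference described in the question is needed (about 13.875 for rows 1 and 2), reshaping with pivot_table is another option. The following is a sketch under assumptions, not part of the original answer: the sample frame is rebuilt from the question, and `wide`, `diff` and `out` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Type": ["put", "call", "put", "call"],
    "Price": [145.0, 15.370001, 1.495, 14.93],
    "Expiration": pd.to_datetime(
        ["2021-01-15", "2018-11-30", "2018-11-30", "2018-11-30"]
    ),
    "Strike": [420.0, 262.0, 262.0, 262.5],
})

keys = ["Expiration", "Strike"]

# One row per (Expiration, Strike) pair and one column per option type;
# aggfunc="first" keeps the first price if a type repeats within a pair.
wide = df.pivot_table(index=keys, columns="Type", values="Price", aggfunc="first")

# NaN wherever either the call or the put is missing for a pair.
diff = (wide["call"] - wide["put"]).rename("diff")

# Attach the per-pair difference back onto the original rows.
out = df.join(diff, on=keys)
print(out)
```

This assumes the data contains at least one call and one put overall; otherwise the `call` or `put` column would be missing from `wide`.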