How to create a new pandas column from scipy.optimize return values

Asked: 2017-06-11 13:52:35

Tags: python pandas scipy

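How can I use the return value of a scipy function to create a new column in a pandas.DataFrame? The scipy.optimize function calls another function to determine the value. I can print the returned values, which verifies that the function works, but I cannot store them in a new pandas column.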

# import packages
import pandas as pd
from math import sqrt, log, exp
from scipy.stats import norm
from scipy import optimize

# define variables
tradingMinutesDay = 390.0
tradingMinutesAnnum = 98280.0

# create pandas.DataFrame
df = pd.DataFrame.from_dict({'CP': [1, -1, 1, -1],
 'M': [1.705, 1.305, 2.45, 1.995],
 'RF': [0.008671, 0.008671, 0.009290, 0.009290],
 'K': [60.0, 60.0, 60.0, 60.0],
 'T': [33.0, 33.0, 53.0, 53.0],
 'S': [60.4, 60.4, 60.4, 60.4]})

# define the objective function
def find_sigma2(sigma, mark, cp, S, K, dte, rf):
    T = (dte * tradingMinutesDay) / tradingMinutesAnnum
    q = 0.0
    log_SK = log(S / K)
    sqrt_T = sqrt(T)
    drf = exp(-rf * T)
    dq = exp(-q*T)
    d1 = (log_SK + T * (rf - q + sigma ** 2 / 2)) / (sigma * sqrt_T)
    d2 = d1 - sigma * sqrt_T
    cdf_d1 = norm.cdf(cp * d1)
    cdf_d2 = norm.cdf(cp * d2)
    return cp * ((S * dq * cdf_d1) - (K * drf * cdf_d2)) - mark

I can run the function and print the values:

# Can print accurate values
for r in df.itertuples():
    print(optimize.brentq(find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4))

0.16798850071790686
0.17589393607434
0.19833696082012875
0.2040142964775614

I cannot store the values with any of the following approaches.

# TypeError: cannot convert the series to <class 'float'>   
df['IV'] = df.apply(optimize.brentq(find_sigma2, .0001, 10, args=(df.M, df.CP, df.S, df.K, df.T, df.RF), xtol=1.0e-4), axis=1)

# AttributeError: 'Pandas' object has no attribute 'IV'
for r in df.itertuples():
    r.IV = optimize.brentq(find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4)

# AttributeError: can't set attribute
df['IV'] = 0
for r in df.itertuples():
    r.IV = optimize.brentq(find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4)

# TypeError: cannot convert the series to <class 'float'>
for i, r in df.iterrows():
    r.IV = optimize.brentq(find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4)

# TypeError: cannot convert the series to <class 'float'>
for i, r in df.iterrows():
    df.set_value(i, r, (optimize.brentq(find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4)))

Expected output:

   CP     K      M        RF     S     T        IV
0   1  60.0  1.705  0.008671  60.4  33.0  0.167989
1  -1  60.0  1.305  0.008671  60.4  33.0  0.175894
2   1  60.0  2.450  0.009290  60.4  53.0  0.198337
3  -1  60.0  1.995  0.009290  60.4  53.0  0.204014

Any ideas?

3 answers:

Answer 0 (score: 3)

Option 1
Brute force

iv = [
    optimize.brentq(
        find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4
    ) for r in df.itertuples()
]
df.assign(IV=iv)

   CP     K      M        RF     S     T        IV
0   1  60.0  1.705  0.008671  60.4  33.0  0.167989
1  -1  60.0  1.305  0.008671  60.4  33.0  0.175894
2   1  60.0  2.450  0.009290  60.4  53.0  0.198337
3  -1  60.0  1.995  0.009290  60.4  53.0  0.204014
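One detail worth noting about Option 1: `df.assign` returns a new DataFrame and leaves the original untouched, so the result has to be assigned back if the column should persist. A minimal sketch, using a toy frame and placeholder values (both are illustrative assumptions, not the question's data):

```python
import pandas as pd

# Toy frame standing in for the question's df (illustrative values only)
toy = pd.DataFrame({'K': [60.0, 60.0]})
iv = [0.1, 0.2]  # placeholder numbers, not real implied volatilities

result = toy.assign(IV=iv)            # returns a new DataFrame
had_iv_before = 'IV' in toy.columns   # the original frame is unchanged
print(had_iv_before)
print('IV' in result.columns)

toy = toy.assign(IV=iv)               # reassign to keep the new column
print(toy)
```

Writing `toy['IV'] = iv` would have the same effect in place.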

Option 2
More brute force

for r in df.itertuples():
    # df.at sets a single cell; DataFrame.set_value was removed in pandas 1.0
    df.at[r.Index, 'IV'] = optimize.brentq(
        find_sigma2, .0001, 10, args=(r.M, r.CP, r.S, r.K, r.T, r.RF), xtol=1.0e-4
    )

df

   CP     K      M        RF     S     T        IV
0   1  60.0  1.705  0.008671  60.4  33.0  0.167989
1  -1  60.0  1.305  0.008671  60.4  33.0  0.175894
2   1  60.0  2.450  0.009290  60.4  53.0  0.198337
3  -1  60.0  1.995  0.009290  60.4  53.0  0.204014

Answer 1 (score: 2)

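What is probably tripping you up is the column named T. On a row Series, .T gives you the transpose of the Series, not the element named T. So something like this will work:

Code: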

def run_brentq(r):
    return optimize.brentq(
        find_sigma2, .0001, 10,
        args=(r.M, r.CP, r.S, r.K, r['T'], r.RF),
        xtol=1.0e-4)

df['IV'] = df.apply(run_brentq, axis=1)
print(df)

Result:

   CP     K      M        RF     S     T        IV
0   1  60.0  1.705  0.008671  60.4  33.0  0.167989
1  -1  60.0  1.305  0.008671  60.4  33.0  0.175894
2   1  60.0  2.450  0.009290  60.4  53.0  0.198337
3  -1  60.0  1.995  0.009290  60.4  53.0  0.204014
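The `.T` pitfall described in this answer can be seen in isolation. The sketch below uses a one-row toy frame (an assumption for illustration) to compare a row Series, as produced by `df.apply(..., axis=1)` or `iterrows()`, with the namedtuple produced by `itertuples()`, which is why the printing loop in the question worked:

```python
from math import sqrt
import pandas as pd

# One-row toy frame with a column named 'T' (illustrative values only)
df_toy = pd.DataFrame({'M': [1.705], 'T': [33.0]})

row = df_toy.iloc[0]              # a Series, like the rows seen by apply/iterrows
tup = next(df_toy.itertuples())   # a namedtuple with fields Index, M, T

print(type(row.T))   # still a Series: .T is the transpose property
print(row['T'])      # the element stored under the label 'T'
print(tup.T)         # namedtuple field access returns the element

# Passing the whole Series to a math function is what raises the TypeError
try:
    sqrt(row.T)
except TypeError as exc:
    print('TypeError:', exc)
```

Bracket access (`r['T']`) sidesteps the name clash without renaming the column.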

Answer 2 (score: 1)

I used the answer given above with the following change, to accommodate records that contain bad data:

def run_brentq(r):
    try:
        return optimize.brentq(
            find_sigma2, .0001, 10,
            args=(r.M, r.CP, r.S, r.K, r['T'], r.RF),
            xtol=1.0e-4)
    except Exception:
        # Bad data (e.g. no sign change in the bracket): fall back to 0
        return 0

df['IV'] = df.apply(run_brentq, axis=1)
print(df)
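If a literal 0 would be hard to tell apart from a genuine result, one possible variation is to return NaN for rows the solver cannot handle. The sketch below is not the answerer's code: `objective`, `solve_or_nan`, and the `target` column are hypothetical stand-ins for `find_sigma2` and the option data. It relies on `optimize.brentq` raising `ValueError` when the function has the same sign at both ends of the bracket.

```python
import math
import pandas as pd
from scipy import optimize

def objective(x, target):
    # Hypothetical stand-in for find_sigma2: root at sqrt(target)
    return x ** 2 - target

def solve_or_nan(r):
    try:
        return optimize.brentq(objective, 0.0, 10.0,
                               args=(r['target'],), xtol=1.0e-4)
    except ValueError:
        # No sign change on [0, 10], so brentq refuses to run
        return float('nan')

# Second row has no real root, so it plays the role of a bad record
toy = pd.DataFrame({'target': [4.0, -1.0]})
toy['root'] = toy.apply(solve_or_nan, axis=1)

failed = toy['root'].isna()   # boolean mask of rows that could not be solved
print(toy)
print(failed.tolist())
```

Failed rows can then be filtered with `isna()` or dropped with `dropna()` rather than being mistaken for a zero result.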