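I am looking at fitting a Johnson SU distribution to a set of empirical S&P 500 index returns. My understanding (disclaimer: I am not a mathematician) is that this distribution accounts for the third and fourth moments (skewness and kurtosis). Besides loc (the mean) and scale (the standard deviation), johnsonsu takes two more parameters, a and b. But the order and meaning of these parameters are confusing.
Here is where I get confused: if I pull returns for the SPDR S&P 500 ETF Trust (SPY), I get the following empirical statistics: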
from pandas_datareader.data import DataReader as dr
r = dr('SPY', 'google', start='2000')['Close'].pct_change().dropna()
# Pass ddof by keyword: the first positional argument of Series.var/std is axis, not ddof
mean, var, std, skew, kurt = r.mean(), r.var(ddof=0), r.std(ddof=0), r.skew(), r.kurt()
# mean: 0.00027732907268771364
# var: 0.00014416720067485022
# std: 0.012006964673673785
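Now, if I fit a normal distribution to this empirical data, .fit should return the loc and scale parameters (all that a normal distribution needs). That checks out: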
import numpy as np
import scipy.stats as scs
normmean, normstd = scs.norm.fit(r)
print(np.allclose(normmean, mean))
print(np.allclose(normstd, std))
True
True
But with scs.johnsonsu.fit:
print(scs.johnsonsu.fit(r))
(0.098009661042083682, 1.022060362199493, 0.0013471690867203458, 0.0072653444313926403)
These should be the four parameters of the distribution: xi, gamma, delta, lam.
That is:
def johnsonmean(gamma, xi, delta, lam):
    mean = xi - lam * np.exp(delta ** -2 / 2) * np.sinh(gamma / delta)
    return mean
gamma, xi, delta, lam = scs.johnsonsu.fit(r) # correct order?
print(johnsonmean(gamma, xi, delta, lam))
-inf
and
mean, var, skew, kurt = scs.johnsonsu.stats(loc=xi, scale=lam,
                                            a=gamma, b=delta, moments='msvk')
gives me a bunch of NaNs.
Answer 0 (score: 2)
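Those are the parameters of the Johnson SU distribution. Keep in mind that the mean of your sample is not the same thing as the mean of the distribution. Here is the expression for the mean:
mean = ξ - λ * exp(δ^-2 / 2) * sinh(γ / δ)
and here is the expression for the variance:
variance = (λ^2 / 2) * (exp(δ^-2) - 1) * (exp(δ^-2) * cosh(2γ / δ) + 1)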
In your code, ξ is loc, λ is scale, γ is a, and δ is b. sinh^-1(x) is equal to log(x + sqrt(1 + x^2)).
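To make the role of sinh^-1 concrete, here is a minimal sketch, assuming SciPy's parametrization in which a + b * sinh^-1((x - loc) / scale) is standard normal. The helper name su_cdf_by_hand and the parameter values are made up for illustration; they are not fitted to any data.

```python
import math
from scipy import stats as scs


def su_cdf_by_hand(x, a, b, loc, scale):
    """Johnson SU CDF built from the normal CDF and an explicit sinh^-1."""
    z = (x - loc) / scale
    asinh_z = math.log(z + math.sqrt(1.0 + z * z))  # sinh^-1(z) written out
    return float(scs.norm.cdf(a + b * asinh_z))


# Illustrative values only (assumption): a = gamma, b = delta, loc = xi, scale = lambda
a, b, loc, scale = 0.1, 1.0, 0.001, 0.007

for x in (-0.03, 0.0, 0.02):
    # Both columns should agree if the parameter mapping is right
    print(x, su_cdf_by_hand(x, a, b, loc, scale),
          scs.johnsonsu.cdf(x, a, b, loc=loc, scale=scale))
```

If the two printed columns agree, the mapping of (γ, δ, ξ, λ) onto (a, b, loc, scale) is the one SciPy uses.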
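So check the values returned by the fit, assign all four parameters properly, then compute the distribution mean and compare it with the sample mean. If that works, repeat the exercise for the variance.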
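As a quick sketch of that check, the fit result quoted in the question can be unpacked in the order SciPy returns it (shape parameters first, then loc and scale) and passed to johnsonsu.stats. The numbers are copied from the question; the variable names are only illustrative.

```python
from scipy import stats as scs

# Fit result quoted in the question, unpacked as a, b, loc, scale
a, b, loc, scale = (0.098009661042083682, 1.022060362199493,
                    0.0013471690867203458, 0.0072653444313926403)

# Moments always come back in the order mean, variance, skewness, kurtosis
m, v, s, k = (float(t) for t in
              scs.johnsonsu.stats(a, b, loc=loc, scale=scale, moments='mvsk'))

# Sample statistics quoted in the question, for comparison
sample_mean, sample_var = 0.00027732907268771364, 0.00014416720067485022

print("mean: sample {0:.6g}, fitted distribution {1:.6g}".format(sample_mean, m))
print("var:  sample {0:.6g}, fitted distribution {1:.6g}".format(sample_var, v))
print("skew {0:.4g}, excess kurtosis {1:.4g}".format(s, k))
```

The fitted moments should land near the sample ones without matching them exactly, because the fit maximizes likelihood rather than matching moments.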
Update
I tried your code with the suggested mean and variance checks and it works fine; see the script below:
import sys
import math

from pandas_datareader.data import DataReader as dr
import scipy.stats as scs


def read_data():
    return dr('SPY', 'google', start='2000')['Close'].pct_change().dropna()


def johnsonsu_mean(a, b, loc, scale):
    """
    Johnson SU mean according to https://en.wikipedia.org/wiki/Johnson%27s_SU-distribution
    """
    v = loc - scale * math.exp(0.5 / b**2) * math.sinh(a/b)
    return v


def johnsonsu_var(a, b, loc, scale):
    """
    Johnson SU variance according to https://en.wikipedia.org/wiki/Johnson%27s_SU-distribution
    """
    t = math.exp(1.0 / b**2)
    v = 0.5*scale**2 * (t - 1.0) * (t * math.cosh(2.0*a/b) + 1.0)
    return v


def johnsonsu_median(a, b, loc, scale):
    """
    Johnson SU median according to https://en.wikipedia.org/wiki/Johnson%27s_SU-distribution
    """
    v = loc + scale * math.sinh(-a/b)
    return v


def main(r):
    # sample statistics; ddof is passed by keyword because the first positional argument is axis
    sample_mean, sample_med = r.mean(), r.median()
    sample_var, sample_std = r.var(ddof=0), r.std(ddof=0)
    sample_skew, sample_kurt = r.skew(), r.kurt()

    a, b, loc, scale = scs.johnsonsu.fit(r)  # fit the data and get distribution parameters back

    # distribution mean, median and variance according to SciPy
    dist_mean = scs.johnsonsu.mean(a, b, loc, scale)
    dist_med = scs.johnsonsu.median(a, b, loc, scale)
    dist_var = scs.johnsonsu.var(a, b, loc, scale)

    # sample mean, median, var vs distribution ones
    print("{0} {1}".format(sample_mean, dist_mean))
    print("{0} {1}".format(sample_med, dist_med))
    print("{0} {1}".format(sample_var, dist_var))
    print("")

    # distribution mean, variance and median according to SciPy vs Wiki
    print("{0} {1}".format(dist_mean, johnsonsu_mean(a, b, loc, scale)))
    print("{0} {1}".format(dist_var, johnsonsu_var(a, b, loc, scale)))
    print("{0} {1}".format(dist_med, johnsonsu_median(a, b, loc, scale)))


if __name__ == "__main__":
    r = read_data()
    main(r)
    sys.exit(0)
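The script above needs a remote data source, which may not be reachable. A rough offline variant, assuming SciPy's parametrization, draws synthetic "returns" by pushing standard normal draws through the sinh transform and compares their sample statistics with what SciPy reports for the same parameters. The function simulate_johnsonsu and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy import stats as scs


def simulate_johnsonsu(a, b, loc, scale, size, seed=0):
    """Draw Johnson SU samples as loc + scale * sinh((Z - a) / b), with Z standard normal."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(size)
    return loc + scale * np.sinh((z - a) / b)


# Illustrative parameters (assumption), not fitted to SPY
a, b, loc, scale = 0.1, 1.5, 0.001, 0.01
x = simulate_johnsonsu(a, b, loc, scale, size=200_000)

# sample statistic vs distribution statistic, one pair per line
print("{0} {1}".format(x.mean(), scs.johnsonsu.mean(a, b, loc=loc, scale=scale)))
print("{0} {1}".format(np.median(x), scs.johnsonsu.median(a, b, loc=loc, scale=scale)))
print("{0} {1}".format(x.var(), scs.johnsonsu.var(a, b, loc=loc, scale=scale)))
```

Each pair should agree up to sampling noise, which is a cheap way to confirm the parameter order before running a fit on real data.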