Question

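I am currently working through this SciPy example on kernel density estimation, specifically the one labelled "Univariate estimation". Instead of generating random data, I am using asset returns. My second estimate (and even the plain normal PDF I plot alongside it) shows a density reaching 20, which doesn't make any sense to me... My code is as follows: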

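import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# data is a pandas DataFrame loaded elsewhere
x1 = np.array(data['actual'].values)[1:]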
xs1 = np.linspace(x1.min()-1,x1.max()+1,len(x1))
std1 = x1.std()
mean1 = x1.mean()

x2 = np.array(data['log_moves'].values)[1:]
xs2 = np.linspace(x2.min()-.01,x2.max()+.01,len(x2))
#xs2 = np.linspace(x2.min()-1,x2.max()+2,len(x2))
std2 = x2.std()
mean2 = x2.mean()

kde1 = stats.gaussian_kde(x1)  # actuals
kde2 = stats.gaussian_kde(x1, bw_method='silverman')

kde3 = stats.gaussian_kde(x2)  # log returns
kde4 = stats.gaussian_kde(x2, bw_method='silverman')

fig = plt.figure(figsize=(10,8))
ax1 = fig.add_subplot(211)
ax1.plot(x1, np.zeros(x1.shape), 'b+', ms=12)  # rug plot
ax1.plot(xs1, kde1(xs1), 'k-', label="Scott's Rule")
ax1.plot(xs1, kde2(xs1), 'b-', label="Silverman's Rule")
ax1.plot(xs1, stats.norm.pdf(xs1,mean1,std1), 'r--', label="Normal PDF")

ax1.set_xlabel('x')
ax1.set_ylabel('Density')
ax1.set_title("Absolute (top) and Returns (bottom) distributions")
ax1.legend(loc=1)

ax2 = fig.add_subplot(212)
ax2.plot(x2, np.zeros(x2.shape), 'b+', ms=12)  # rug plot
ax2.plot(xs2, kde3(xs2), 'k-', label="Scott's Rule")
ax2.plot(xs2, kde4(xs2), 'b-', label="Silverman's Rule")
ax2.plot(xs2, stats.norm.pdf(xs2,mean2,std2), 'r--', label="Normal PDF")

ax2.set_xlabel('x')
ax2.set_ylabel('Density')

plt.show()

My results:

For reference, the first and second moments of the data:

print(std1)
print(mean1)
print(std2)
print(mean2)
4.66416718334
0.0561365678347
0.0219996729055
0.00027330546845

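Also, if I change the second plot to produce a lognormal PDF, I get a flat line (which, if the Y axis were scaled correctly like the top one, I'm sure would show a distribution like I'd expect).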

Answer 1

The result of a kernel density estimate is a probability density. While a probability cannot be greater than 1, a density can.
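As a rough sanity check, here is a minimal sketch using the mean and standard deviation printed in the question. The peak of a normal PDF is 1/(sigma * sqrt(2*pi)), so a standard deviation of about 0.022 forces a peak of roughly 18. The uniform distribution on [0, 0.05] is an extra illustration that is not in the question: its density is about 20 everywhere on that interval, yet it is a perfectly valid distribution.

```python
import math
from scipy import stats

# Values printed in the question for the log returns.
std2 = 0.0219996729055
mean2 = 0.00027330546845

# Peak height of a normal PDF: 1 / (sigma * sqrt(2 * pi)).
peak_formula = 1.0 / (std2 * math.sqrt(2.0 * math.pi))
# The same value from SciPy, evaluated at the mean.
peak_scipy = stats.norm.pdf(mean2, loc=mean2, scale=std2)
print(peak_formula, peak_scipy)  # both are roughly 18

# Illustrative assumption: a uniform distribution on [0, 0.05].
# Its density is 1 / 0.05, i.e. about 20, across the whole interval.
uniform_density = stats.uniform(loc=0.0, scale=0.05).pdf(0.02)
print(uniform_density)
```

The narrower the spread of the data, the taller the density has to be for the total area to stay at 1.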

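Given a probability density curve, you can find the probability within a range (x_1, x_2) by integrating the probability density over that range. Judging by eye, the integral under both of your curves is approximately 1, so the output looks correct.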
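Rather than judging by eye, the integral can be checked numerically. The sketch below is only an illustration: the asker's data is not available, so it uses a synthetic normal sample (an assumption) with roughly the same mean and standard deviation as the log returns. It uses `gaussian_kde.integrate_box_1d` to integrate the estimated density over an interval.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the log returns (assumption: the real data is not
# available here, so we draw normal samples with a similar mean and spread).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.00027, scale=0.022, size=1000)

kde = stats.gaussian_kde(returns)

# The density near the centre is far above 1 ...
peak_density = kde(returns.mean())[0]

# ... but the area under the whole curve is still approximately 1.
total_area = kde.integrate_box_1d(returns.min() - 1.0, returns.max() + 1.0)

# Probability of a move between x_1 = -0.01 and x_2 = 0.01.
prob_small_move = kde.integrate_box_1d(-0.01, 0.01)

print(peak_density, total_area, prob_small_move)
```

Running the same two `integrate_box_1d` calls on `kde3` and `kde4` from the question should confirm whether the estimates there also integrate to about 1.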

Python - SciPy Kernel Estimation Example - Density >> 1
