Question

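I am trying to estimate the probability of some samples under the assumption that they follow a multivariate Gaussian distribution. I first use my dataset to estimate the covariance matrix and the mean vector with numpy, and then I use scipy to obtain the probability values of some samples.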

The problem I am facing is that the probability values I obtain are much higher than 1. Here is the code:

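import numpy as np
from scipy.stats import multivariate_normal

# sample_list contains my samples (one sample per entry)
# After the transpose, each row of x is one variable and each column one sample
x = np.array(sample_list).T
cov = np.cov(x)                # np.cov treats rows as variables by default
mean_vec = np.mean(x, axis=1)  # per-variable mean
print("Covariance")
print(cov)
print("Mean")
print(mean_vec)
print("Determinant")
print(np.linalg.det(cov))

# Frozen multivariate normal with the estimated parameters
rv = multivariate_normal(mean=mean_vec, cov=cov)
# Draw one random sample from the same distribution and evaluate its density
x = np.random.multivariate_normal(mean_vec, cov)
print("Random sample")
print(x)
print("pdf(x)")
print(rv.pdf(x))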

Here is an example of the output:

Covariance
[[ 0.00114548  0.00126964  0.00163885 -0.00097021]
 [ 0.00126964  0.00147635  0.00181914 -0.00051609]
 [ 0.00163885  0.00181914  0.0023556  -0.00165483]
 [-0.00097021 -0.00051609 -0.00165483  0.01451006]]
Mean
[ 0.04382192  0.05937116  0.05526359  0.36803589]
Determinant
1.23897068383e-15
Random sample
[ 0.04897992  0.0540464   0.06269819  0.30528633]
pdf(x)
150525.528322

As you can see, the pdf value is very high. Am I doing something wrong in my code? Or maybe there is some math I am not understanding.
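A quick sanity check on that value, offered as a sketch rather than a definitive answer: `rv.pdf` returns a probability density, not a probability, and a density can exceed 1 when the covariance is very small. The peak density of a k-dimensional Gaussian is 1 / sqrt((2*pi)^k * det(cov)), so the determinant printed above sets an upper bound on what `pdf` can return. The names `det_cov`, `k`, `sigma`, `peak_density`, and `rv_1d` below are illustrative and not part of the original code. The 1-D case is added only as an assumed simpler analogy.

```python
import numpy as np
from scipy.stats import norm

# Determinant and dimension taken from the output printed in the question.
det_cov = 1.23897068383e-15
k = 4

# Density of a k-dimensional Gaussian evaluated at its mean (the maximum):
# 1 / sqrt((2*pi)^k * det(cov)). Here this is roughly 7.2e5.
peak_density = 1.0 / np.sqrt((2 * np.pi) ** k * det_cov)
print("peak density:", peak_density)

# 1-D analogy (assumed example): a narrow Gaussian with sigma = 0.01.
sigma = 0.01
rv_1d = norm(loc=0.0, scale=sigma)

# The density at the mean is about 39.9, well above 1 ...
print("1-D pdf at mean:", rv_1d.pdf(0.0))

# ... but the probability of an interval, obtained by integrating the
# density, stays between 0 and 1 (about 0.68 within one sigma).
prob_one_sigma = rv_1d.cdf(sigma) - rv_1d.cdf(-sigma)
print("P(|x| < sigma):", prob_one_sigma)
```

Under these assumptions the reported value of 150525.528322 lies below the peak density, so it is at least consistent with a correctly evaluated density for such a small covariance.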

Thanks!

Multivariate Gaussian probability higher than 1 with numpy / scipy

0 answers: