我想计算多元正态分布的对数似然。
数据:
data = np.random.multivariate_normal(mean=[2,5], cov=[[1, 0], [0, 10]], size=1000)
可能性(我跟着Wikipedia):
def likelihood(mean, cov):
# mean = np.array([y1, y2]), cov = np.array([[c1, 0], [0, c2]])
loglikelihood = -0.5*( np.log(np.linalg.det(cov)) + (data - mean).transpose() * np.linalg.inv(cov) * (data - mean) + 2 * np.log(2 * np.pi) )
loglikelihoodsum = loglikelihood.sum()
return loglikelihoodsum
它返回以下错误:
> likelihood(mean, cov)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-54-3b20c2eefea4> in <module>()
----> 1 likelihood(mean, cov)
<ipython-input-53-8a2a7219131c> in likelihood(mean, cov)
2 # param = mean[y1, y2], cov = [[c1, 0], [0, c2]]
3
----> 4 loglikelihood = -0.5*( np.log(np.linalg.det(cov)) + (data - mean).transpose() * np.linalg.inv(cov) * (data - mean) + 2 * np.log(2 * np.pi) )
5
6 loglikelihoodsum = loglikelihood.sum()
ValueError: operands could not be broadcast together with shapes (2,1000) (2,2)
我该如何解决?
答案 0 :(得分:0)
与*
中的numpy
运算符相乘是指元素乘法。您想使用np.einsum
来计算内部产品:
mean = np.random.normal(0, 1, 2)
cov = np.random.normal(0, 1, (2, 2))
data = np.random.normal(0, 1, (1000, 2))
residuals = data - mean
loglikelihood = -0.5 * (
np.log(np.linalg.det(cov))
+ np.einsum('...j,jk,...k', residuals, np.linalg.inv(cov), residuals)
+ len(mean) * np.log(2 * np.pi)
)
np.sum(loglikelihood)
答案 1 :(得分:0)
我找到答案:
def likelihood(mean, cov): # Wikipedia
def calc_loglikelihood(residuals):
return -0.5 * (np.log(np.linalg.det(cov)) + residuals.T.dot(np.linalg.inv(cov)).dot(residuals) + 2 * np.log(2 * np.pi))
# mean = np.array([y1, y2]), cov = np.array([[c1, 0], [0, c2]])
residuals = (data - mean)
loglikelihood = np.apply_along_axis(calc_loglikelihood, 1, residuals)
loglikelihoodsum = loglikelihood.sum()
return loglikelihoodsum
答案 2 :(得分:0)
您可以使用 scipy.stats.multivariate_normal.logpdf