我想将Dirichlet process mixtures for density estimation上的Austin案例扩展到多变量案例。
我发现使用pymc3的多变量高斯混合物的第一个信息是issue at Github。参与该问题的人说,有两种不同的解决方案,但它们对我不起作用。例如,通过在这样的简单模型中使用Brandon的多变量扩展:
import numpy as np
import pymc3 as pm
from mvnormal_extension import MvNormal
with pm.Model() as model:
var_x = MvNormal('var_x', mu = 3*np.zeros(2), tau = np.diag(np.ones(2)), shape=2)
trace = pm.sample(100)
我无法在(3,3)左右获得正确的均值:
pm.summary(trace)
var_x:
Mean SD MC Error 95% HPD interval
-------------------------------------------------------------------
0.220 1.161 0.116 [-1.897, 2.245]
0.165 1.024 0.102 [-2.626, 1.948]
Posterior quantiles:
2.5 25 50 75 97.5
|--------------|==============|==============|--------------|
-1.897 -0.761 0.486 1.112 2.245
-2.295 -0.426 0.178 0.681 2.634
感谢贝纳文特,可以复制另一种解决方案:
import numpy as np
import pymc3 as pm
import scipy
import theano
from theano import tensor
target_data = np.random.random((500, 16))
N_COMPONENTS = 5
N_SAMPLES, N_DIMS = target_data.shape
# Dirichilet prior.
ALPHA_0 = np.ones(N_COMPONENTS)
# Component means prior.
MU_0 = np.zeros(N_DIMS)
LAMB_0 = 1. * np.eye(N_DIMS)
# Components precision prior.
BETA_0, BETA_1 = 0., 1. # Covariance stds prior uniform limits.
L_0 = 2. # LKJ corr. shape. Larger shape -> more biased to identity.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
# Source: http://stackoverflow.com/q/29759789/1901296
N_ELEMS = N_DIMS * (N_DIMS - 1) / 2
tri_index = np.zeros([N_DIMS, N_DIMS], dtype=int)
tri_index[np.triu_indices(N_DIMS, k=1)] = np.arange(N_ELEMS)
tri_index[np.triu_indices(N_DIMS, k=1)[::-1]] = np.arange(N_ELEMS)
with pm.Model() as model:
# Component weight prior.
pi = pm.Dirichlet('pi', ALPHA_0, testval=np.ones(N_COMPONENTS) / N_COMPONENTS)
#pi_potential = pm.Potential('pi_potential', tensor.switch(tensor.min(pi) < .01, -np.inf, 0))
###################
# Components plate.
###################
# Component means.
mus = [pm.MvNormal('mu_{}'.format(i), MU_0, LAMB_0, shape=N_DIMS)
for i in range(N_COMPONENTS)]
# Component precisions.
#lamb = diag(sigma) * corr(corr_shape) * diag(sigma)
corr_vecs = [
pm.LKJCorr('corr_vec_{}'.format(i), L_0, N_DIMS)
for i in range(N_COMPONENTS)
]
# Transform the correlation vector representations to matrices.
corrs = [
tensor.fill_diagonal(corr_vecs[i][tri_index], 1.)
for i in range(N_COMPONENTS)
]
# Stds for the correlation matrices.
cov_stds = pm.Uniform('cov_stds', BETA_0, BETA_1, shape=(N_COMPONENTS, N_DIMS))
# Finally re-compose the covariance matrices using diag(sigma) * corr * diag(sigma)
# Source http://austinrochford.com/posts/2015-09-16-mvn-pymc3-lkj.html
lambs = []
for i in range(N_COMPONENTS):
std_diag = tensor.diag(cov_stds[i])
cov = std_diag.dot(corrs[i]).dot(std_diag)
lambs.append(tensor.nlinalg.matrix_inverse(cov))
stacked_mus = tensor.stack(mus)
stacked_lambs = tensor.stack(lambs)
#####################
# Observations plate.
#####################
z = pm.Categorical('z', pi, shape=N_SAMPLES)
@theano.as_op(itypes=[tensor.dmatrix, tensor.lvector, tensor.dmatrix, tensor.dtensor3],
otypes=[tensor.dscalar])
def likelihood_op(values, z_values, mu_values, prec_values):
logp = 0.
for i in range(N_COMPONENTS):
indices = z_values == i
if not indices.any():
continue
logp += scipy.stats.multivariate_normal(
mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
return logp
def likelihood(values):
return likelihood_op(values, z, stacked_mus, stacked_lambs)
y = pm.DensityDist('y', likelihood, observed=target_data)
step1 = pm.Metropolis(vars=mus + lambs + [pi])
step2 = pm.ElemwiseCategoricalStep(vars=[z], values=list(range(N_COMPONENTS)))
trace = pm.sample(100, step=[step1, step2])
我已将此代码pm.ElemwiseCategoricalStep
更改为pm.ElemwiseCategorical
和
logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices])
通过
logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
但我得到了这个例外:
ValueError: expected an ndarray
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - (i2 + i3))}}[(0, 0)](Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0, Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0)
Toposort index: 101
Inputs types: [TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar)]
Inputs shapes: [(), (), (), ()]
Inputs strides: [(), (), (), ()]
Inputs values: [array(-127.70516572917249), -13460.012199423296, array(-110.90354888959129), -13234.61313535326]
Outputs clients: [['output']]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
我感谢任何帮助。 谢谢!