Multivariate Dirichlet process mixture for density estimation using pymc3

Date: 2016-07-20 18:01:36

Tags: gaussian pymc3 mixture-model

I want to extend Austin's example on Dirichlet process mixtures for density estimation to the multivariate case.

The first information I found about multivariate Gaussian mixtures with pymc3 was an issue at Github. The people involved in that issue suggest two different solutions, but neither of them works for me. For example, using Brandon's multivariate extension in a simple model like this one:

import numpy as np
import pymc3 as pm
from mvnormal_extension import MvNormal

with pm.Model() as model:
    var_x = MvNormal('var_x', mu=3 * np.zeros(2), tau=np.diag(np.ones(2)), shape=2)
    trace = pm.sample(100)

I cannot get the correct means around (3, 3):

pm.summary(trace)

var_x:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------

  0.220            1.161            0.116            [-1.897, 2.245]
  0.165            1.024            0.102            [-2.626, 1.948]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|

  -1.897         -0.761         0.486          1.112          2.245
  -2.295         -0.426         0.178          0.681          2.634
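
As a sanity check on the model above: 3*np.zeros(2) is just the zero vector, so the prior actually specified is N(0, I), and the summary (means near 0, SDs near 1) matches it. Below is a minimal sketch, assuming the same mvnormal_extension module, that centres the prior at (3, 3) and draws enough samples for the mean to settle:

import numpy as np
import pymc3 as pm
from mvnormal_extension import MvNormal

with pm.Model() as model:
    # Prior mean (3, 3): np.ones, not np.zeros. With no observed data
    # the posterior equals the prior, so the sample means should
    # approach 3 given enough draws.
    var_x = MvNormal('var_x', mu=3 * np.ones(2), tau=np.diag(np.ones(2)), shape=2)
    trace = pm.sample(5000)

pm.summary(trace)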

Thanks to Benavente, the other solution can be reproduced here:

import numpy as np
import pymc3 as pm
import scipy
import theano
from theano import tensor


target_data = np.random.random((500, 16))

N_COMPONENTS = 5
N_SAMPLES, N_DIMS = target_data.shape

# Dirichlet prior.
ALPHA_0 = np.ones(N_COMPONENTS)
# Component means prior.
MU_0 = np.zeros(N_DIMS)
LAMB_0 = 1. * np.eye(N_DIMS)
# Component precisions prior.
BETA_0, BETA_1 = 0., 1.    # Uniform limits for the covariance stds prior.
L_0 = 2.       # LKJ corr. shape. Larger shape -> more biased to identity.

# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
# Source: http://stackoverflow.com/q/29759789/1901296
N_ELEMS = N_DIMS * (N_DIMS - 1) // 2   # integer division so np.arange gets an int
tri_index = np.zeros([N_DIMS, N_DIMS], dtype=int)
tri_index[np.triu_indices(N_DIMS, k=1)] = np.arange(N_ELEMS)
tri_index[np.triu_indices(N_DIMS, k=1)[::-1]] = np.arange(N_ELEMS)
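# For example, with N_DIMS = 3 the flat LKJ vector [c01, c02, c12]
# indexed by tri_index gives
#   [[c01, c01, c02],
#    [c01, c01, c12],
#    [c02, c12, c01]]
# (the untouched diagonal slots default to index 0), which is why
# fill_diagonal(..., 1.) is applied below to force the diagonal to ones.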

with pm.Model() as model:
    # Component weight prior.
    pi = pm.Dirichlet('pi', ALPHA_0, testval=np.ones(N_COMPONENTS) / N_COMPONENTS)
    #pi_potential = pm.Potential('pi_potential', tensor.switch(tensor.min(pi) < .01, -np.inf, 0))

    ###################
    # Components plate.
    ###################
    # Component means.
    mus = [pm.MvNormal('mu_{}'.format(i), MU_0, LAMB_0, shape=N_DIMS)
           for i in range(N_COMPONENTS)]

    # Component precisions.
    #lamb = diag(sigma) * corr(corr_shape) * diag(sigma)
    corr_vecs = [
            pm.LKJCorr('corr_vec_{}'.format(i), L_0, N_DIMS)
            for i in range(N_COMPONENTS)
    ]
    # Transform the correlation vector representations to matrices.
    corrs = [
        tensor.fill_diagonal(corr_vecs[i][tri_index], 1.)
        for i in range(N_COMPONENTS)
    ]
    # Stds for the correlation matrices.
    cov_stds = pm.Uniform('cov_stds', BETA_0, BETA_1, shape=(N_COMPONENTS, N_DIMS))
    # Finally re-compose the covariance matrices using diag(sigma) * corr * diag(sigma)
    # Source http://austinrochford.com/posts/2015-09-16-mvn-pymc3-lkj.html
    lambs = []
    for i in range(N_COMPONENTS):
        std_diag = tensor.diag(cov_stds[i])
        cov = std_diag.dot(corrs[i]).dot(std_diag)
        lambs.append(tensor.nlinalg.matrix_inverse(cov))

    stacked_mus = tensor.stack(mus)
    stacked_lambs = tensor.stack(lambs)
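    # stacked_mus has shape (N_COMPONENTS, N_DIMS) and stacked_lambs
    # shape (N_COMPONENTS, N_DIMS, N_DIMS), matching the dmatrix and
    # dtensor3 itypes declared for likelihood_op below.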

    #####################
    # Observations plate.
    #####################
    z = pm.Categorical('z', pi, shape=N_SAMPLES)

    @theano.as_op(itypes=[tensor.dmatrix, tensor.lvector, tensor.dmatrix, tensor.dtensor3],
                  otypes=[tensor.dscalar])
    def likelihood_op(values, z_values, mu_values, prec_values):
        logp = 0.
        for i in range(N_COMPONENTS):
            indices = z_values == i
            if not indices.any():
                continue

            logp += scipy.stats.multivariate_normal(
                mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
        return logp

    def likelihood(values):
        return likelihood_op(values, z, stacked_mus, stacked_lambs)

    y = pm.DensityDist('y', likelihood, observed=target_data)

    step1 = pm.Metropolis(vars=mus + lambs + [pi])
    step2 = pm.ElemwiseCategoricalStep(vars=[z], values=list(range(N_COMPONENTS)))

    trace = pm.sample(100, step=[step1, step2])

In this code I have already changed pm.ElemwiseCategoricalStep to pm.ElemwiseCategorical, and replaced

logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices])

with

logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices]).sum()

since logpdf returns one log-density per row of values[indices] and the op has to accumulate a single scalar. But I get this exception:

ValueError: expected an ndarray
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - (i2 + i3))}}[(0, 0)](Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0, Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0)
Toposort index: 101
Inputs types: [TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar)]
Inputs shapes: [(), (), (), ()]
Inputs strides: [(), (), (), ()]
Inputs values: [array(-127.70516572917249), -13460.012199423296, array(-110.90354888959129), -13234.61313535326]
Outputs clients: [['output']]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
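
Looking at the "Inputs values" line, two of the four inputs are proper 0-d arrays while the two FromFunctionOp{likelihood_op} outputs are plain Python floats (-13460.01..., -13234.61...). Since the op declares otypes=[tensor.dscalar], Theano seems to expect a 0-d ndarray back. A possible fix, which is only my untested assumption, would be to return an ndarray from likelihood_op:

    @theano.as_op(itypes=[tensor.dmatrix, tensor.lvector, tensor.dmatrix, tensor.dtensor3],
                  otypes=[tensor.dscalar])
    def likelihood_op(values, z_values, mu_values, prec_values):
        logp = 0.
        for i in range(N_COMPONENTS):
            indices = z_values == i
            if not indices.any():
                continue
            # Note: scipy's second argument is a covariance matrix, so passing
            # prec_values[i] here treats the precision as a covariance.
            logp += scipy.stats.multivariate_normal(
                mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
        # Return a 0-d ndarray rather than a bare Python float, so the output
        # matches the declared dscalar otype (the assumed fix).
        return np.array(logp)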

I appreciate any help. Thanks!

0 Answers