How to define a custom prior in PyMC3

Time: 2014-07-12 04:23:03

Tags: python bayesian pymc3

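I would like to know whether it is possible to define a custom prior in PyMC3, and how to do it. Judging from here, it seems relatively easy to do in PyMC2 (without modifying the source code), but not so easy in PyMC3 (or I am misunderstanding something). I am trying to replicate a prior from the book "Doing Bayesian Data Analysis", where it is implemented in BUGS: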

model {
# Likelihood. Each flip is Bernoulli.
for ( i in 1 : N1 ) { y1[i] ~ dbern( theta1 ) }
for ( i in 1 : N2 ) { y2[i] ~ dbern( theta2 ) }
# Prior. Curved scallops!
x ~ dunif(0,1)
y ~ dunif(0,1)
N <- 4
xt <- sin( 2*3.141593*N * x ) / (2*3.141593*N) + x
yt <- 3 * y + (1/3)
xtt <- pow( xt , yt )
theta1 <- xtt
theta2 <- y
}

The prior itself does not make much sense; it is just an example of how to define a custom prior and of the versatility of BUGS.
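To make the shape of this prior concrete, the BUGS block can be read as a forward simulation: draw x and y uniformly, then push them through the deterministic transformations. The following is only a minimal sketch in plain Python 3 (standard library, no PyMC). The names `transform` and `sample_prior` are hypothetical helpers introduced for illustration, and `math.pi` stands in for the truncated constant 3.141593 used in the model.

```python
import math
import random


def transform(x, y, n=4):
    """Map (x, y) in [0, 1]^2 to (theta1, theta2) as in the BUGS model.

    Assumption: math.pi replaces the truncated constant 3.141593.
    """
    xt = math.sin(2 * math.pi * n * x) / (2 * math.pi * n) + x
    yt = 3 * y + 1 / 3
    theta1 = xt ** yt  # pow(xt, yt) in BUGS
    theta2 = y
    return theta1, theta2


def sample_prior(n_draws, seed=0):
    """Draw (theta1, theta2) pairs by pushing uniform draws through transform."""
    rng = random.Random(seed)
    return [transform(rng.random(), rng.random()) for _ in range(n_draws)]


if __name__ == "__main__":
    # Print a few prior draws; a scatter plot of many draws shows the shape.
    for theta1, theta2 in sample_prior(5):
        print(round(theta1, 3), round(theta2, 3))
```

Plotting many such draws is one way to check a PyMC3 implementation against the intended prior.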

My attempt at implementing the custom prior above is:

from __future__ import division
import numpy as np
import pymc as pm
from pymc import Continuous
from theano.tensor import sin, log

# Generate the data
y1 = np.array([1, 1, 1, 1, 1, 0, 0])  # 5 heads and 2 tails
y2 = np.array([1, 1, 0, 0, 0, 0, 0])  # 2 heads and 5 tails

class Custom_prior(Continuous):
    """
    custom prior
    """
    def __init__(self, y, *args, **kwargs):
        super(Custom_prior, self).__init__(*args, **kwargs)
        self.y = y
        self.N = 4
        self.mean = 0.625  # FIXME
    def logp(self, value):
        N = self.N
        y = self.y
        return -log((sin(2*3.141593*N * value)
                     / (2*3.141593*N) + value)**(3 * y + (1/3)))

with pm.Model() as model:
    theta2 = pm.Uniform('theta2', 0, 1)  # prior
    theta1 = Custom_prior('theta1', theta2)  # prior
    # define the likelihood
    y1 = pm.Bernoulli('y1', p=theta1, observed=y1)
    y2 = pm.Bernoulli('y2', p=theta2, observed=y2)
    # Generate a MCMC chain
    start = pm.find_MAP()  # Find starting value by optimization
    trace = pm.sample(5000, pm.NUTS(), progressbar=False)

Edit

Following chris-fonnesbeck's answer, I think I need something like this:

with pm.Model() as model:
    theta2 = pm.Uniform('theta2', 0, 1)  # prior
    N = 4
    theta1 = pm.DensityDist('theta1', lambda value: -log((sin(2*3.141593*N * value)
                       / (2*3.141593*N) + value)**(3 * theta2 + (1/3))))
    # define the likelihood
    y1 = pm.Bernoulli('y1', p=theta1, observed=y1)
    y2 = pm.Bernoulli('y2', p=theta2, observed=y2)

    # Generate a MCMC chain
    start = pm.find_MAP()  # Find starting value by optimization
    trace = pm.sample(10000, pm.NUTS(), progressbar=False) # Use NUTS sampling

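The only problem is that I get the same value for every posterior sample of theta1 and theta2. I suspect something is wrong with my custom prior, or with the combination of prior and likelihood. A successful definition of a custom prior can be found in this example.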

1 answer:

Answer 0 (score: 3)

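Can you post the complete BUGS model? In the code above, it looks like a series of deterministic transformations applied in BUGS after the priors on x and y, rather than the definition of a prior.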

Assuming the logp above is what you want, you can implement it more simply in PyMC:

def logp(value, y):
    N  = 4
    return -log((sin(2*3.141593*N * value)
                 / (2*3.141593*N) + value)**(3 * y + (1/3)))

theta1 = pm.DensityDist('theta1', logp, value, y=theta2)
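As a quick sanity check outside PyMC, the expression inside logp can be evaluated on plain floats. The sketch below is an assumption-laden illustration, not part of the answer: `logp_float` and `logp_float_expanded` are hypothetical names, `math` replaces the Theano functions, and `math.pi` replaces the truncated constant. It also shows that the expression equals -(3y + 1/3) * log(xt), since log(a**b) = b * log(a) for a > 0.

```python
import math


def logp_float(value, y, n=4):
    """The logp expression from the answer, evaluated on plain floats."""
    xt = math.sin(2 * math.pi * n * value) / (2 * math.pi * n) + value
    return -math.log(xt ** (3 * y + 1 / 3))


def logp_float_expanded(value, y, n=4):
    """Same quantity, rewritten with log(a**b) == b * log(a) for a > 0."""
    xt = math.sin(2 * math.pi * n * value) / (2 * math.pi * n) + value
    return -(3 * y + 1 / 3) * math.log(xt)


if __name__ == "__main__":
    # Both forms should agree up to floating-point rounding.
    print(logp_float(0.5, 0.5), logp_float_expanded(0.5, 0.5))
```

Both functions require value > 0, because the logarithm is undefined at xt = 0.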