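Online Bayesian learning in PyMC (repeatedly updating posterior beliefs)

Time: 2013-10-18 01:01:37

Tags: python machine-learning pymc

The following model is part of the PyMC tutorial. It is called disaster_model.py and can be imported in the main code to be used as a model: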

"""
A model for the disasters data with a changepoint

changepoint ~ U(0, 110)
early_mean ~ Exp(1.)
late_mean ~ Exp(1.)
disasters[t] ~ Po(early_mean if t <= switchpoint, late_mean otherwise)

"""

from pymc import *
from numpy import array, empty
from numpy.random import randint

__all__ = ['disasters_array', 'switchpoint', 'early_mean', 'late_mean', 'rate', 'disasters']

disasters_array =   array([ 4, 5, 4, 0, 1, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 6, 
                            3, 3, 5, 4, 5, 3, 1, 4, 4, 1, 5, 5, 3, 4, 2, 5, 
                            2, 2, 3, 4, 2, 1, 3, 2, 2, 1, 1, 1, 1, 3, 0, 0, 
                            1, 0, 1, 1, 0, 0, 3, 1, 0, 3, 2, 2, 0, 1, 1, 1, 
                            0, 1, 0, 1, 0, 0, 0, 2, 1, 0, 0, 0, 1, 1, 0, 2, 
                            3, 3, 1, 1, 2, 1, 1, 1, 1, 2, 4, 2, 0, 0, 1, 4, 
                            0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1])

# Define data and stochastics

switchpoint = DiscreteUniform('switchpoint', lower=0, upper=110, doc='Switchpoint[year]')
early_mean = Exponential('early_mean', beta=1.)
late_mean = Exponential('late_mean', beta=1.)

@deterministic(plot=False)
def rate(s=switchpoint, e=early_mean, l=late_mean):
    ''' Concatenate Poisson means '''
    out = empty(len(disasters_array))
    out[:s] = e
    out[s:] = l
    return out

disasters = Poisson('disasters', mu=rate, value=disasters_array, observed=True)

It is now possible to sample from this model with the MCMC Metropolis-Hastings algorithm to obtain the posterior distributions of the parameters.

from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
M.sample(iter=10000, burn=1000, thin=10)

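Now my question: suppose I obtain new data after this sampling. How do I then update my posterior distribution? Basically, how can online learning be implemented with PyMC?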

1 answer:

Answer 0: (score: 1)

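You need to specify a new model for the update. The reason is that you will now have informative priors for the unknown parameters. Specifically, the DiscreteUniform on the switchpoint will become a categorical or multinomial (n = 1), and the rate parameters will probably both be normally distributed. You can fit these priors (using one of several methods) to the posterior samples from the first run of the model. If you plan to update repeatedly, you can easily carry out this update programmatically.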
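As a rough illustration of the fitting step, here is a minimal NumPy-only sketch. It assumes the posterior samples are already available as arrays (for example, taken from the sampler's traces). The helper names fit_categorical and fit_normal, the add-one smoothing, and the simulated stand-in samples are all assumptions made for this example and are not part of PyMC or of the original question.

```python
import numpy as np


def fit_categorical(samples, lower, upper, pseudocount=1.0):
    """Estimate probabilities for each integer in [lower, upper] from samples.

    Hypothetical helper: a pseudocount is added to every value so that no
    switchpoint is ruled out completely by the new prior.
    """
    samples = np.asarray(samples, dtype=int)
    if samples.min() < lower or samples.max() > upper:
        raise ValueError("samples fall outside [lower, upper]")
    counts = np.bincount(samples - lower, minlength=upper - lower + 1)
    counts = counts.astype(float) + pseudocount
    return counts / counts.sum()


def fit_normal(samples):
    """Moment-match a normal distribution; return (mean, precision)."""
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean()
    sd = samples.std(ddof=1)
    return mu, 1.0 / sd ** 2


# Stand-in posterior samples, simulated purely for illustration.
# In practice these would be the samples from the first model run.
rng = np.random.default_rng(0)
switch_samples = rng.integers(38, 44, size=900)   # integers 38..43
early_samples = rng.normal(3.0, 0.3, size=900)
late_samples = rng.normal(0.9, 0.1, size=900)

# Parameters of the informative priors for the next model.
switch_p = fit_categorical(switch_samples, lower=0, upper=110)
early_mu, early_tau = fit_normal(early_samples)
late_mu, late_tau = fit_normal(late_samples)
```

In a PyMC 2 model, these values could then serve as the p argument of a Categorical switchpoint and as the mu and tau arguments of Normal priors on the two rates. Two caveats apply to this sketch: a normal prior puts some mass on negative rates, so a truncated or gamma fit may be a safer choice, and fitting each parameter separately discards any posterior correlation between them.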