Fitting a binomial distribution with pymc raises a ZeroProbability error for some fill values

Asked: 2015-07-22 13:02:58

Tags: python statistics bayesian pymc mcmc

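I am not sure whether I have found a bug in pymc. It seems that fitting a Binomial with missing data can produce a ZeroProbability error, depending on the fill_value chosen to mask the missing data. But maybe I am using it incorrectly. I tried the following example with the current master branch from GitHub. I am aware of the bug concerning Binomial distributions in pymc 2.3.4, but this seems to be a different issue.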

First I fitted a Binomial distribution with pymc, and everything worked as expected:

import scipy as sp
import pymc

def make_model(observed_values):
    p = pymc.Uniform('p', lower = 0.0, upper = 1.0, value = 0.1)
    values = pymc.Binomial('values', n = 10* sp.ones_like(observed_values), p = p * sp.ones_like(observed_values),\
                             value = observed_values, observed = True, plot = False)
    values = pymc.Binomial('values', n = 10, p = p,\
                             value = observed_values, observed = True, plot = False)
    return locals()

sp.random.seed(0)
observed_values = sp.random.binomial(n = 10.0, p = 0.1, size = 100)

M1 = pymc.MCMC(make_model(observed_values))
M1.sample(iter=10000, burn=1000, thin=10)
pymc.Matplot.plot(M1)
M1.summary()

Output:

  [-----------------100%-----------------] 10000 of 10000 complete in 0.7 sec
Plotting p

  p:

          Mean             SD               MC Error        95% HPD interval
          ------------------------------------------------------------------
          0.093            0.007            0.0              [ 0.081  0.107]


          Posterior quantiles:

          2.5             25              50              75             97.5
          |---------------|===============|===============|---------------|
          0.08             0.088           0.093          0.097         0.106

Next I tried a very similar case, the only difference being that one observed value is missing:

mask = sp.zeros_like(observed_values)
mask[0] = True
masked_values = sp.ma.masked_array(observed_values, mask = mask, fill_value = 999999)

M2 = pymc.MCMC(make_model(masked_values))
M2.sample(iter=10000, burn=1000, thin=10)
pymc.Matplot.plot(M2)
M2.summary()

Unexpectedly, I got a ZeroProbability error:

---------------------------------------------------------------------------
ZeroProbability                           Traceback (most recent call last)
<ipython-input-16-4f945f269628> in <module>()
----> 1 M2 = pymc.MCMC(make_model(masked_values))
      2 M2.sample(iter=10000, burn=1000, thin=10)
      3 pymc.Matplot.plot(M2)
      4 M2.summary()

<ipython-input-12-cb8707bb911f> in make_model(observed_values)
      4 def make_model(observed_values):
      5     p = pymc.Uniform('p', lower = 0.0, upper = 1.0, value = 0.1)
----> 6     values = pymc.Binomial('values', n = 10* sp.ones_like(observed_values), p = p * sp.ones_like(observed_values),                             value = observed_values, observed = True, plot = False)
      7     values = pymc.Binomial('values', n = 10, p = p,                             value = observed_values, observed = True, plot = False)
      8     return locals()

/home/fabian/anaconda/lib/python2.7/site-packages/pymc/distributions.pyc in __init__(self, *args, **kwds)
    318                     logp_partial_gradients=logp_partial_gradients,
    319                     dtype=dtype,
--> 320                     **arg_dict_out)
    321 
    322     new_class.__name__ = name

/home/fabian/anaconda/lib/python2.7/site-packages/pymc/PyMCObjects.pyc in __init__(self, logp, doc, name, parents, random, trace, value, dtype, rseed, observed, cache_depth, plot, verbose, isdata, check_logp, logp_partial_gradients)
    773         if check_logp:
    774             # Check initial value
--> 775             if not isinstance(self.logp, float):
    776                 raise ValueError(
    777                     "Stochastic " +

/home/fabian/anaconda/lib/python2.7/site-packages/pymc/PyMCObjects.pyc in get_logp(self)
    930                     (self._value, self._parents.value))
    931             else:
--> 932                 raise ZeroProbability(self.errmsg)
    933 
    934         return logp

ZeroProbability: Stochastic values's value is outside its support,
or it forbids its parents' current values.

However, if I change the fill value in the masked array to 1, the fit works again:

masked_values2 = sp.ma.masked_array(observed_values, mask = mask, fill_value = 1)

M3 = pymc.MCMC(make_model(masked_values2))
M3.sample(iter=10000, burn=1000, thin=10)
pymc.Matplot.plot(M3)
M3.summary()

Output:

[-----------------100%-----------------] 10000 of 10000 complete in 2.1 sec
Plotting p

p:

        Mean             SD               MC Error        95% HPD interval
        ------------------------------------------------------------------
        0.092            0.007            0.0              [ 0.079  0.105]


        Posterior quantiles:

        2.5             25              50              75             97.5
        |---------------|===============|===============|---------------|
        0.079            0.088           0.092          0.097         0.105


values:

        Mean             SD               MC Error        95% HPD interval
        ------------------------------------------------------------------
        1.15             0.886            0.029                  [ 0.  3.]


        Posterior quantiles:

        2.5             25              50              75             97.5
        |---------------|===============|===============|---------------|
        0.0              1.0             1.0            2.0           3.0
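
Since the fit with fill_value = 1 runs, a possible workaround is to always pick a fill value inside the support of the distribution, that is, an integer between 0 and n. Below is a minimal sketch of that idea. The name mask_with_valid_fill is a hypothetical helper of mine, not a pymc function, and only fill_value = 1 was actually tried with pymc above.

```python
import numpy as np

def mask_with_valid_fill(observed, mask, n, fill_value=0):
    """Build a masked array whose fill_value lies in the Binomial support [0, n].

    Hypothetical helper for illustration; not part of pymc.
    """
    if not 0 <= fill_value <= n:
        raise ValueError("fill_value must lie between 0 and n")
    return np.ma.masked_array(observed, mask=mask, fill_value=fill_value)

observed = np.array([3, 0, 2, 1])
mask = np.array([True, False, False, False])

masked = mask_with_valid_fill(observed, mask, n=10, fill_value=1)
print(masked.filled())  # the masked first entry is replaced by 1
```

The resulting array could then be passed to make_model in place of masked_values.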

Is this a bug, or is there something wrong with my model? Thanks for your help!

0 answers:

No answers yet