Error when running PyMC: Stochastic value is outside its support, or it forbids its parents' current values

Time: 2014-11-18 07:21:28

Tags: python r pymc jags r2jags

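I am trying to use PyMC to take four different predictors of a medical condition as input and combine them into an overall posterior probability that a patient has the condition, given that some subset of the predictors said "yes, this patient has the condition".

The idea is to draw theta (the overall rate of the condition) and a false negative and false positive rate for each predictor from beta distributions, and from those compute the marginal and posterior probabilities using Bayes' theorem. I have an array of 16 counts, one for each possible combination of predictor outputs (with 4 predictors there are 2 ** 4 = 16 different combinations). I feed this set of counts and the marginal probabilities to a multinomial distribution, similar to how disasters_array is used with a Poisson distribution in the example in the PyMC tutorial at http://pymc-devs.github.io/pymc/tutorial.html.

Here is the code I wrote to try to do this: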

from pymc import Multinomial, Beta, deterministic
from numpy import array

n = 4 # number of predictors

counts_array = array([2942808473, 17491655, 21576, 23189, 339805, 89159, 168214, 76044, 43138288, 530963, 22682, 22169, 462052, 129472, 2804257, 3454104]) # 16 counts - one count value for each possible permutation of predictors that detected medical condition
pred = array([[0,0,0,0],[1,0,0,0],[0,1,0,0],[1,1,0,0],[0,0,1,0],[1,0,1,0],[0,1,1,0],[1,1,1,0],[0,0,0,1],[1,0,0,1],[0,1,0,1],[1,1,0,1],[0,0,1,1],[1,0,1,1],[0,1,1,1],[1,1,1,1]]); # array of yes/no's from predictors for each value in counts_array above

theta = Beta('theta',1,2)

fn = fp = tn = tp = [0] * 4;
for i in range(0,n):
    fn[i] = Beta('fn' + str(i),1,2)
    fp[i] = Beta('fp' + str(i),1,2)
    tn[i] = 1 - fp[i]
    tp[i] = 1 - fn[i]

@deterministic(plot=False)
def margprobs():
    mp = [0] * 2**n; # initialize with vector of 2**n zeros
    for i in range(0,2**n):
        mp[i] = theta *\
            (fn[0]**(1-pred[i][0])) * (tp[0]**pred[i][0]) *\
            (fn[1]**(1-pred[i][1])) * (tp[1]**pred[i][1]) *\
            (fn[2]**(1-pred[i][2])) * (tp[2]**pred[i][2]) *\
            (fn[3]**(1-pred[i][3])) * (tp[3]**pred[i][3])\
            + (1-theta) *\
            (tn[0]**(1-pred[i][0])) * (fp[0]**pred[i][0]) *\
            (tn[1]**(1-pred[i][1])) * (fp[1]**pred[i][1]) *\
            (tn[2]**(1-pred[i][2])) * (fp[2]**pred[i][2]) *\
            (tn[3]**(1-pred[i][3])) * (fp[3]**pred[i][3]);
    return mp;

@deterministic(plot=False)
def postprobs():
    pp = [0] * 2**n; # initialize with vector of 2**n zeros
    for i in range(0,2**n):
        pp[i] = theta *\
            (fn[0]**(1-pred[i][0])) * (tp[0]**pred[i][0]) *\
            (fn[1]**(1-pred[i][1])) * (tp[1]**pred[i][1]) *\
            (fn[2]**(1-pred[i][2])) * (tp[2]**pred[i][2]) *\
            (fn[3]**(1-pred[i][3])) * (tp[3]**pred[i][3])\
            / margprobs[i];
    return pp;

counts = Multinomial(name="counts", value=counts_array, n=2**n, p=margprobs, observed=True)

When I run it, I get an error relating to the last line, where counts is computed:

$ python test.py
Traceback (most recent call last):
  File "test.py", line 46, in <module>
    counts = Multinomial(name="counts", value=counts_array, n=2**n, p=margprobs, observed=True)
  File "/Users/jtr4v/anaconda/lib/python2.7/site-packages/pymc/distributions.py", line 3269, in __init__
    verbose=verbose, **kwds)
  File "/Users/jtr4v/anaconda/lib/python2.7/site-packages/pymc/PyMCObjects.py", line 772, in __init__
    if not isinstance(self.logp, float):
  File "/Users/jtr4v/anaconda/lib/python2.7/site-packages/pymc/PyMCObjects.py", line 929, in get_logp
    raise ZeroProbability(self.errmsg)
pymc.Node.ZeroProbability: Stochastic counts's value is outside its support,
 or it forbids its parents' current values

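Evidently the error has to do with PyMC not liking some of the values I am passing to Multinomial(), but I am not sure which one is wrong. I believe value should be counts_array (my observed counts), n should be 16 since I want an array of 16 counts, one for each possible combination of predictors, p should be my marginal probabilities, and observed should be True since I have observed the values.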
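One thing that can be checked without PyMC: in a multinomial distribution, n is the number of trials, so the observed counts must sum to n, and any other value has zero probability. The sketch below is a hand-written log-pmf using only the standard library (the function name `multinomial_logpmf` and the uniform `p` are placeholders, not PyMC's implementation). It suggests that n = 16 puts counts_array outside the support, which is a plausible cause of the ZeroProbability error. For comparison, the R2jags model further down passes `total` rather than 16 to dmulti.

```python
from math import inf, isfinite, lgamma, log

def multinomial_logpmf(x, n, p):
    """Log-probability of counts x under Multinomial(n, p); -inf outside the support."""
    # Support condition: non-negative counts that add up to the number of trials.
    if any(k < 0 for k in x) or sum(x) != n:
        return -inf
    logp = lgamma(n + 1)
    for k, p_k in zip(x, p):
        if k > 0:
            if p_k <= 0.0:
                return -inf
            logp += k * log(p_k)
        logp -= lgamma(k + 1)
    return logp

# Observed counts copied from the question.
counts = [2942808473, 17491655, 21576, 23189, 339805, 89159, 168214, 76044,
          43138288, 530963, 22682, 22169, 462052, 129472, 2804257, 3454104]

# Placeholder probabilities (uniform); the real model would use margprobs.
p = [1.0 / len(counts)] * len(counts)

logp_n16 = multinomial_logpmf(counts, 16, p)             # n as in the question
logp_total = multinomial_logpmf(counts, sum(counts), p)  # n = total number of observations

print(logp_n16)
print(isfinite(logp_total))
```

If this is indeed the issue, passing the sum of counts_array as n would be the first thing to try; whether other problems remain in the model is not established by this check.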

What am I doing wrong?

Edit: In case it helps, I previously did this in R2jags with the following code:
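A separate point worth checking in the listing is the line `fn = fp = tn = tp = [0] * 4`. In Python, chained assignment binds all four names to the same list object, so each assignment in the loop overwrites the previous one. The sketch below uses plain floats in place of the Beta variables (the numbers are made up for illustration) to show the effect, along with one possible way to create four independent lists.

```python
n = 4  # number of predictors, as in the question

# Chained assignment: all four names point at the same list object.
fn = fp = tn = tp = [0.0] * n
shared = fn is fp is tn is tp

# Floats stand in for the Beta draws (made-up values, for illustration only).
for i in range(n):
    fn[i] = 0.2
    fp[i] = 0.05        # overwrites the value just stored through fn
    tn[i] = 1 - fp[i]   # overwrites again
    tp[i] = 1 - fn[i]   # fn[i] is the same slot as tn[i], so this undoes the previous line
aliased_result = (fn[0], fp[0], tn[0], tp[0])

# One way to get four independent lists.
fn2, fp2, tn2, tp2 = ([0.0] * n for _ in range(4))
for i in range(n):
    fn2[i] = 0.2
    fp2[i] = 0.05
    tn2[i] = 1 - fp2[i]
    tp2[i] = 1 - fn2[i]
independent_result = (fn2[0], fp2[0], tn2[0], tp2[0])

print(shared)
print(aliased_result)
print(independent_result)
```

With the shared list, all four rates end up identical for each predictor, so the 16 marginal probabilities would all be equal and would generally not sum to 1.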

model {
    theta ~ dbeta(1,2); # draw theta from a beta distribution
    for (i in 1:N) { # draw a false positive and false negative rate for each of N predictors
        fp[i] ~ dbeta(1,2);
        fn[i] ~ dbeta(1,2);
        tp[i] <- 1-fn[i]; # true positive = 1 - false negative rate
        tn[i] <- 1-fp[i]; # true negative rate = 1 - false positive rate
    }
    for (j in 1:M) {
    # Bayes theorem, i.e.
    # posterior probability =
    #     P(A) * P(B|A)
    #     /
    #     (P(A) * P(B|A) + P(-A) * P(B|-A))  # <--- last line is marginal probability
    #
    # x is a vector of 1's and 0's indicating whether the ith predictor said yes or no
    margprobs[j] <- (theta *
                        (fn[1]^(1-x[j,1])) * (tp[1]^x[j,1]) *
                        (fn[2]^(1-x[j,2])) * (tp[2]^x[j,2]) *
                        (fn[3]^(1-x[j,3])) * (tp[3]^x[j,3]) *
                        (fn[4]^(1-x[j,4])) * (tp[4]^x[j,4])
                     + (1-theta) *
                        (tn[1]^(1-x[j,1])) * (fp[1]^x[j,1]) *
                        (tn[2]^(1-x[j,2])) * (fp[2]^x[j,2]) *
                        (tn[3]^(1-x[j,3])) * (fp[3]^x[j,3]) *
                        (tn[4]^(1-x[j,4])) * (fp[4]^x[j,4]));

    postprobs[j] <- theta *
                        (fn[1]^(1-x[j,1])) * (tp[1]^x[j,1]) *
                        (fn[2]^(1-x[j,2])) * (tp[2]^x[j,2]) *
                        (fn[3]^(1-x[j,3])) * (tp[3]^x[j,3]) *
                        (fn[4]^(1-x[j,4])) * (tp[4]^x[j,4])
                        / margprobs[j];


    }
    counts ~ dmulti(margprobs, total);
}
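The Bayes computation in the model above can be sanity-checked numerically outside of any sampler. The following is a minimal sketch in plain Python, assuming (as the product form of the formulas implies) that the predictors are conditionally independent given the patient's true status. The function name `pattern_probs` and the rate values are made up for illustration, and the pattern ordering from `itertools.product` is not the ordering of `pred` in the question.

```python
from itertools import product

def pattern_probs(theta, fn, fp):
    """Map each yes/no pattern to (marginal probability, posterior probability)."""
    out = {}
    for pattern in product((0, 1), repeat=len(fn)):
        with_cond = theta            # P(condition) * P(pattern | condition)
        without_cond = 1.0 - theta   # P(no condition) * P(pattern | no condition)
        for x, fn_i, fp_i in zip(pattern, fn, fp):
            # Condition present: "yes" is a true positive, "no" a false negative.
            with_cond *= (1.0 - fn_i) if x else fn_i
            # Condition absent: "yes" is a false positive, "no" a true negative.
            without_cond *= fp_i if x else (1.0 - fp_i)
        marginal = with_cond + without_cond
        out[pattern] = (marginal, with_cond / marginal)
    return out

# Made-up rates, for illustration only (not estimates from the question's data).
theta = 0.1
fn = [0.2, 0.3, 0.1, 0.25]
fp = [0.05, 0.1, 0.02, 0.08]

probs = pattern_probs(theta, fn, fp)
total_marginal = sum(marginal for marginal, _ in probs.values())

print(len(probs), total_marginal)
print(probs[(1, 1, 1, 1)])
print(probs[(0, 0, 0, 0)])
```

The marginal probabilities over all 16 patterns should sum to 1 (up to floating-point error), which is what a multinomial likelihood expects of its probability vector.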

0 answers:

No answers