我在使用PyMC时遇到了一些问题,这是一个用于蒙特卡罗模拟的python包。我想知道这是PyMC中的错误还是我的问题。
在这个例子中,我们有一个只有两个节点的模型:A,B。 B是依赖于A的分类,在某种意义上,A的值用于通过(确定性)帮助函数(确定性地)获得B的概率:
from pymc import Categorical, deterministic, MCMC
from pylab import hist, show
ITER=5000
BURN = 0
THETA = 0.5
A = Categorical('A', p=[0.5, 0.5])
@deterministic(plot=False)
def B_helper(a=A):
if A == 1:
return [THETA, 1 - THETA]
else:
return [1 - THETA, THETA]
B = Categorical('B', p = B_helper, value=0, observed=True)
M = MCMC([A,B])
M.sample(iter=ITER, burn=BURN, thin=1)
hist(M.trace('A')[:])
show()
print("\n The probability of A=0 equals: %f" \
% ((M.trace('A')[:]==0).sum()/float(ITER-BURN)))
对于THETA = 0.5
人们会期望完全对称。贝叶斯规则给出了P(A=0|B=0) = 0.5
。我获得了远离它的东西(它似乎收敛到1/3)。
现在,如果我通过添加" dummy"来改变上面的代码。在节点中概率为0的第一个状态,我确实得到了预期的结果:
from pymc import Categorical, deterministic, MCMC
from pylab import hist, show
ITER=50000
BURN = 0
THETA = 0.5
A = Categorical('A', p=[0, 0.5, 0.5]) # Note the first state is "dummy" with zero probability.
@deterministic(plot=False)
def B_helper(a=A):
if A == 0:
return 0
elif A == 1:
return [0, THETA, 1 - THETA] # Note the first state is "dummy" with zero probability.
else:
return [0, 1 - THETA, THETA] # Note the first state is "dummy" with zero probability.
B = Categorical('B', p = B_helper, value=1, observed=True)
M = MCMC([A,B])
M.sample(iter=ITER, burn=BURN, thin=1)
hist(M.trace('A')[:])
show()
print("\n The probability of A=1 equals: %f" \
% ((M.trace('A')[:]==0).sum()/float(ITER-BURN)))
这里状态0是A和B的虚拟状态。对于THETA = 0.5
我应该得到(由于贝叶斯规则的对称性)P(A=1|B=1) = 0.5
。我明白了。此外,"虚拟"的位置。国家很重要。如果我将虚拟对象设置为状态2,就像下面的代码一样,我再次得到错误的结果。
from pymc import Categorical, deterministic, MCMC
from pylab import hist, show
ITER=50000
BURN = 0
THETA = 0.5
A = Categorical('A', p=[0.5, 0.5, 0]) # Note I changed the order: dummy state is now the third.
@deterministic(plot=False)
def B_helper(a=A):
if A == 0:
return 0
elif A == 1:
return [THETA, 1 - THETA, 0] # Note I changed the order: dummy state is now the third.
else:
return [1 - THETA, THETA, 0] # Note I changed the order: dummy state is now the third.
B = Categorical('B', p = B_helper, value=1, observed=True)
M = MCMC([A,B])
M.sample(iter=ITER, burn=BURN, thin=1)
hist(M.trace('A')[:])
show()
print("\n The probability of A=1 equals: %f" \
% ((M.trace('A')[:]==0).sum()/float(ITER-BURN)))
我要么监督一些微妙的东西(希望不是完全明显的),要么分类在pymc中做一些奇怪的事情(logp值似乎没问题)。
我已经在github网站上发布了这个,但可以理解的是,获取答案需要更长的时间。如果这是我的一些问题,我想我会在这里更快地知道。任何帮助赞赏。的问候,