After reading Cam Davidson-Pilon's Probabilistic Programming & Bayesian Methods for Hackers, I've decided to try my hand at a Hidden Markov Model (HMM) learning problem with PyMC. So far, the code is not cooperating, but through troubleshooting, I feel that I have narrowed down the source of the issue.
Breaking the code into smaller chunks and focusing on the initial probability and emission probabilities at t=0, I am able to learn the emission/observation parameters of a single state at time t=0. However, once I add another state (for a total of two states), the parameter estimates are identical (and incorrect) regardless of the input data. So I feel that I must have done something wrong in the @pm.deterministic portion of the code, which is not allowing me to sample from the Init initial probability function.
With this portion of code, I am aiming to learn the initial probability p_bern and the emission probabilities p_0 and p_1 corresponding to states 0 and 1, respectively. The emission is conditional on the state, which is what I am trying to express with my @pm.deterministic function. Can I have the "if" statement in this deterministic function? It seems to be the root of the problem.
# This code is to test the ability to discern between two states with emissions
import numpy as np
import pymc as pm
from matplotlib import pyplot as plt
N = 1000
state = np.zeros(N)
data = np.zeros(shape=N)
# Generate data
for i in range(N):
    state[i] = pm.rbernoulli(p=0.3)
for i in range(N):
    if state[i]==0:
        data[i] = pm.rbernoulli(p=0.4)
    elif state[i]==1:
        data[i] = pm.rbernoulli(p=0.8)
# Prior on probabilities
p_bern = pm.Uniform("p_S", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)
Init = pm.Bernoulli("Init", p=p_bern) # Bernoulli node
@pm.deterministic
def p_T(Init=Init, p_0=p_0, p_1=p_1, p_bern=p_bern):
    if Init==0:
        return p_0
    elif Init==1:
        return p_1
obs = pm.Bernoulli("obs", p=p_T, value=data, observed=True)
model = pm.Model([obs, p_bern, p_0, p_1])
mcmc = pm.MCMC(model)
mcmc.sample(20000, 10000)
pm.Matplot.plot(mcmc)
I have already attempted the following to no avail:

- Using the @pm.potential decorator to create a joint distribution
- Changing the location of Init (you can see my comment in the code where I am unsure of where to put it)
- Writing a @pm.stochastic similar to this

Edit: As per Chris's suggestion, I've moved the Bernoulli node outside of the deterministic. I've also updated the code to a simpler model (a Bernoulli observation instead of a multinomial) for easier troubleshooting.
Thank you for your time and attention. Any feedback is warmly received. Also, if I am missing any information please let me know!
Answer 0: (score: 2)
I would move that stochasticity out of the deterministic. The value of a deterministic node should be completely determined by the values of its parents; hiding a random variable inside the node violates this.

Why not create a Bernoulli node and pass it as an argument to the deterministic?
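As a minimal sketch of the contrast (reusing the question's variable names; the "bad" version is a hypothetical anti-pattern written for illustration, not the asker's exact original code):

import pymc as pm

p_bern = pm.Uniform("p_bern", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)

# Anti-pattern: randomness hidden inside a deterministic. p_T_bad's
# value changes between evaluations even when p_0 and p_1 do not,
# so it is not a function of its parents.
@pm.deterministic
def p_T_bad(p_0=p_0, p_1=p_1):
    return p_1 if pm.rbernoulli(0.3) else p_0

# Better: make the random quantity its own stochastic node and pass
# it to the deterministic as a parent. Now p_T is fully determined
# by the values of Init, p_0, and p_1.
Init = pm.Bernoulli("Init", p=p_bern)

@pm.deterministic
def p_T(Init=Init, p_0=p_0, p_1=p_1):
    return p_1 if Init else p_0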
Answer 1: (score: 2)
Based on the updated information you provided, here is some code that works:
import numpy as np
import pymc as pm
from matplotlib import pyplot as plt
N = 1000
# Generate data
state = pm.rbernoulli(p=0.3, size=N)  # latent states
data = [int(pm.rbernoulli(0.8*s or 0.4)) for s in state]  # emit with p=0.8 if s else p=0.4
# Prior on probabilities
p_S = pm.Uniform("p_S", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)
# Use values of Init as indices to probabilities
Init = pm.Bernoulli("Init", p=p_S, size=N) # Bernoulli node
p_T = pm.Lambda('p_T', lambda p_0=p_0, p_1=p_1, i=Init: np.array([p_0, p_1])[i.astype(int)])
obs = pm.Bernoulli("obs", p=p_T, value=data, observed=True)
model = pm.MCMC(locals())
model.sample(20000, 10000)
model.summary()
Notice that in the data generation step, I used the state to index into the appropriate true probabilities. I essentially do the same thing in the specification of p_T. It seems to work reasonably well, but note that, depending on where things are initialized, the two values p_0 and p_1 can end up corresponding to either of the true values (nothing constrains one to be larger than the other), so the value of p_S can end up as the complement of the true state probability.
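If that label switching is a problem, one way to break the symmetry (my own sketch, not part of the original answer) is to add a potential to the model above that forbids orderings with p_1 <= p_0:

# Sketch: break the p_0/p_1 label-switching symmetry. The potential
# contributes log-probability 0 when p_1 > p_0 and -inf otherwise,
# so MCMC proposals that swap the labels are rejected.
@pm.potential
def ordered(p_0=p_0, p_1=p_1):
    return 0.0 if p_1 > p_0 else -np.inf

Since the model is built with pm.MCMC(locals()), the potential is picked up automatically along with the other nodes, as long as it is defined before that call.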