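I have implemented a linear regression model in pymc3 in which the unknown weight vector is constrained to be a probability mass function, and is therefore modelled as a Dirichlet distribution, as shown in the code below: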
import numpy as np
import pymc3 as pm
from theano.tensor import dot  # assumed source of the bare `dot` used below

with pm.Model() as model:
    # prior on precision of normal likelihood
    tau = pm.Gamma('tau', alpha=1, beta=1)
    phi = np.empty(ncountries, dtype=object)
    y = np.empty((nyears-1, ncountries), dtype=object)
    for icountry, country in enumerate(countries):
        # prior Dirichlet allocation for each country
        phi[icountry] = pm.Dirichlet('mix_{c}'.format(c=country),
                                     np.roll(mix, icountry),
                                     shape=ncountries)
        for iyear, year in enumerate(years[1:]):
            suffix = '_{y}-{c}'.format(y=year, c=country)
            previous_pop = Xs[iyear, :]
            # likelihood
            y[iyear, icountry] = pm.Normal('obs' + suffix,
                                           mu=pm.Deterministic(
                                               'mu' + suffix,
                                               dot(phi[icountry], previous_pop)),
                                           tau=tau,
                                           observed=Ys[iyear, icountry])
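To make the structure of the mean explicit, here is a minimal NumPy-only sketch of the same computation outside pymc3. The sizes, the random `Phi` and the random `Xs` are toy stand-ins invented for illustration, not my real data. It shows that the per-element `dot(phi[icountry], Xs[iyear, :])` is equivalent to a single matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
ncountries, nyears = 3, 5  # hypothetical toy sizes

# Each row of Phi is one country's weight vector, drawn on the simplex.
Phi = rng.dirichlet(np.ones(ncountries), size=ncountries)
# Toy stand-in for Xs: one row of previous-year populations per year.
Xs = rng.uniform(1.0, 10.0, size=(nyears - 1, ncountries))

# Loop form, mirroring the model:
# mu[iyear, icountry] = dot(phi[icountry], Xs[iyear, :])
mu_loop = np.empty((nyears - 1, ncountries))
for icountry in range(ncountries):
    for iyear in range(nyears - 1):
        mu_loop[iyear, icountry] = np.dot(Phi[icountry], Xs[iyear, :])

# Equivalent matrix form: one product instead of a double loop.
mu_matrix = Xs @ Phi.T
```

Because each row of `Phi` is a convex combination, every mean must lie between the smallest and largest entry of the corresponding row of `Xs`. A weight greater than one breaks that property.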
I sample the posterior by running:
start = pm.find_MAP()
step = pm.Metropolis()
nsteps = 1000
trace = pm.sample(nsteps, step, start=start)
I analysed the traces of the Dirichlet variables and found that their values do not sum to one (an example is shown below):
array([[ 0.01029745, 0.00627394, 0.00996922, ..., 1.83955829,
0.00962185, 0.01020659],
[ 0.01029745, 0.00627394, 0.00996922, ..., 1.83955829,
0.00962185, 0.01020659],
[ 0.01029745, 0.00627394, 0.00996922, ..., 1.83955829,
0.00962185, 0.01020659],
...,
[ 0.02050308, 0.01685555, 0.01976797, ..., 1.92278065,
0.03956622, 0.00473735],
[ 0.01993214, 0.01632033, 0.01994876, ..., 1.92487858,
0.04078728, 0.00481424],
[ 0.01900882, 0.01528191, 0.02100671, ..., 1.92485693,
0.0395159 , 0.00524575]])
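I am not familiar with theano variables and find it hard to explore how the Dirichlet RV is expressed in pymc3... Am I doing something wrong, or should I normalize the values returned in the trace so that they sum to one?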
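For reference, this is roughly how the check on a trace can be done. It is a minimal sketch: `simplex_report` is a hypothetical helper of mine, and `toy_trace` is made-up data standing in for something like `trace['mix_<country>']`. It flags every sampled vector that has an entry outside [0, 1] or does not sum to one.

```python
import numpy as np

def simplex_report(samples, tol=1e-8):
    """Return the row sums and a boolean mask of rows lying on the simplex."""
    samples = np.asarray(samples, dtype=float)
    row_sums = samples.sum(axis=1)
    # A valid probability vector sums to one ...
    sums_ok = np.abs(row_sums - 1.0) <= tol
    # ... and has every entry in [0, 1].
    in_range = np.all((samples >= -tol) & (samples <= 1.0 + tol), axis=1)
    return row_sums, sums_ok & in_range

# Made-up samples: the first row is valid, the second has an entry above one.
toy_trace = np.array([[0.2, 0.3, 0.5],
                      [0.01, 1.84, 0.01]])
row_sums, valid = simplex_report(toy_trace)
print(row_sums)
print(valid)
```

An entry above one, like the 1.83955829 in my trace, already fails the range check on its own, before the sum is even considered.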
Quick update
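It looks like pm.find_MAP() uses a form of gradient-descent optimization. This does not take into account the constraint that a vector drawn from a Dirichlet distribution is a probability mass function (its values must be positive and must sum to 1). The constraint is apparently not enforced during the sampling phase of the algorithm either, and this causes convergence problems as the precision of the likelihood drifts towards zero.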
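To illustrate what respecting the constraint could look like, here is a minimal sketch of one generic reparameterization. I am not claiming this is how pm.find_MAP() or pm.Dirichlet is implemented. The idea is to let the optimizer or sampler work on an unconstrained real vector and map it onto the simplex with a softmax. The `softmax` helper and the example vector are my own illustration.

```python
import numpy as np

def softmax(z):
    """Map any real vector to positive weights that sum to one."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()   # subtract the max for numerical stability
    w = np.exp(shifted)
    return w / w.sum()

# Any unconstrained values, e.g. an optimizer's current iterate.
unconstrained = np.array([-3.0, 0.5, 10.0, 2.0])
weights = softmax(unconstrained)
print(weights, weights.sum())
```

With this mapping, a gradient step on the unconstrained vector can never leave the simplex. An optimizer acting directly on the weights has no such guarantee.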