I'm trying to reproduce the results of this tutorial (see LASSO regression) with PyMC3. As noted in this reddit thread, the first two coefficients mix poorly because the variables are correlated.
I tried implementing it in PyMC3, but it doesn't work as expected when using the Hamiltonian samplers. I can only get it to run with the Metropolis sampler, which gives the same results as PyMC2.
I don't know whether it's related to the fact that the Laplace distribution is peaked (its derivative is discontinuous at 0), but the model works perfectly well with Gaussian priors. I've tried with and without MAP initialization, and the result is always the same.
Here's my code:
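To make the non-smoothness concrete, here's a quick sketch (not part of the model) checking the finite-difference slope of the Laplace log-density on either side of 0 with the same scale b = 1/sqrt(2) used below; the gradient that NUTS relies on jumps from +1/b to -1/b at the peak:

```python
import numpy as np
from scipy.stats import laplace

b = 1 / np.sqrt(2)
eps = 1e-6

# Finite-difference slope of the Laplace log-density just left and right of 0.
left = (laplace.logpdf(0, scale=b) - laplace.logpdf(-eps, scale=b)) / eps
right = (laplace.logpdf(eps, scale=b) - laplace.logpdf(0, scale=b)) / eps

print(left, right)  # approximately +1/b and -1/b: the gradient is discontinuous at 0
```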
from pymc import *
from numpy import sqrt  # needed for the Laplace scale below
from scipy.stats import norm
import pylab as plt
# Same model as the tutorial
n = 1000
x1 = norm.rvs(0, 1, size=n)
x2 = -x1 + norm.rvs(0, 10**-3, size=n)
x3 = norm.rvs(0, 1, size=n)
y = 10 * x1 + 10 * x2 + 0.1 * x3
with Model() as model:
    # Laplacian prior only works with Metropolis sampler
    coef1 = Laplace('x1', 0, b=1/sqrt(2))
    coef2 = Laplace('x2', 0, b=1/sqrt(2))
    coef3 = Laplace('x3', 0, b=1/sqrt(2))

    # Gaussian prior works with NUTS sampler
    #coef1 = Normal('x1', mu=0, sd=1)
    #coef2 = Normal('x2', mu=0, sd=1)
    #coef3 = Normal('x3', mu=0, sd=1)

    likelihood = Normal('y', mu=coef1 * x1 + coef2 * x2 + coef3 * x3, tau=1, observed=y)

    #step = Metropolis() # Works just like PyMC2
    start = find_MAP() # Doesn't help
    step = NUTS(state=start) # Doesn't work
    trace = sample(10000, step, start=start, progressbar=True)

plt.figure(figsize=(7, 7))
traceplot(trace)
plt.tight_layout()
autocorrplot(trace)
summary(trace)
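Incidentally, the near-perfect anticorrelation between x1 and x2 mentioned above is easy to verify directly (a standalone sketch of the same data-generating step, with a hypothetical seed added for reproducibility):

```python
import numpy as np
from scipy.stats import norm

np.random.seed(0)  # hypothetical seed, not in the original model
n = 1000
x1 = norm.rvs(0, 1, size=n)
x2 = -x1 + norm.rvs(0, 10**-3, size=n)  # x2 is x1 plus tiny noise, sign-flipped

r = np.corrcoef(x1, x2)[0, 1]
print(r)  # very close to -1
```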
This is the error I get:
PositiveDefiniteError: Simple check failed. Diagonal contains negatives
Am I doing something wrong, or is the NUTS sampler just not supposed to work in a case like this?