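Regression with "unidirectional" noise

Asked: 2014-03-05 10:09:59

Tags: python regression pymc mcmc

I want to estimate, from data, the parameters of a simple linear function and of a gamma-distributed noise term. (Note: this is a follow-up to https://stats.stackexchange.com/questions/88676/regression-with-unidirectional-noise, but simplified and more implementation-specific.) Suppose my observed data is generated as follows: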

import numpy as np
np.random.seed(0)

size = 200
true_intercept = 1
true_slope = 2

# Generate observed data
x_ = np.linspace(0, 1, size)
true_regression_line = true_intercept + true_slope * x_  # y = a + b*x
noise_ = np.random.gamma(shape=1.0, scale=1.0, size=size)
y_ = true_regression_line + noise_

The data looks like this: [image]

I tried to estimate these parameters with pymc as follows:

from pymc import Normal, Gamma, Uniform, Model, MAP
# Define priors
intercept = Normal('intercept', 0, tau=0.1)
slope = Normal('slope', 0, tau=0.1)
alpha = Uniform('alpha', 0, 2)
beta = Uniform('beta', 0, 2)
noise = Gamma('noise', alpha=alpha, beta=beta, size=size)

# Give likelihood > 0 to models where the regression line becomes larger than
# any of the data points
y = Normal('y', mu=intercept + slope * x_ + noise, tau=100,
           observed=True, value=y_)

# Perform MAP fit of model
model = Model([alpha, beta, intercept, slope, noise])
map_ = MAP(model)
map_.fit()

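However, this gives me estimates that are far from the true values:

  • Intercept: true: 1.000, estimated: 3.281
  • Slope: true: 2.000, estimated: -3.400

Am I doing something wrong?

1 Answer:

Answer 0: (score: 0)

You appear to be specifying a Normal likelihood on top of the Gamma noise, so you are adding extra Gaussian noise to the model, which does not seem warranted. Try expressing the likelihood as a Gamma rather than a Normal, since that is the distribution of the residuals.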
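To make that suggestion concrete, here is a minimal sketch of a Gamma likelihood placed directly on the residuals y - (intercept + slope * x). It is written in plain NumPy rather than pymc, so it illustrates the idea and is not a drop-in replacement for the model above. The helper name gamma_loglik is hypothetical, and treating beta as a rate parameter is an assumption of this sketch.

```python
import math
import numpy as np

# Same synthetic data as in the question.
np.random.seed(0)
size = 200
x_ = np.linspace(0, 1, size)
y_ = 1 + 2 * x_ + np.random.gamma(shape=1.0, scale=1.0, size=size)


def gamma_loglik(params, x, y):
    """Log-likelihood of y when y - (intercept + slope * x) ~ Gamma(alpha, rate=beta).

    Hypothetical helper for illustration; it is not part of pymc.
    """
    intercept, slope, alpha, beta = params
    resid = y - (intercept + slope * x)
    # The Gamma density lives on (0, inf): a single data point on or below
    # the line gives the whole parameter set zero likelihood.
    if alpha <= 0 or beta <= 0 or np.any(resid <= 0):
        return -np.inf
    return float(np.sum(alpha * math.log(beta) - math.lgamma(alpha)
                        + (alpha - 1) * np.log(resid) - beta * resid))


# Compare the true parameters with the MAP estimates reported in the question
# (alpha = beta = 1 assumed for both, matching the data-generating process).
ll_true = gamma_loglik((1.0, 2.0, 1.0, 1.0), x_, y_)
ll_map = gamma_loglik((3.281, -3.400, 1.0, 1.0), x_, y_)
print(ll_true, ll_map)
```

Under this likelihood, the line is only allowed to sit below every data point, which is the "unidirectional" constraint from the question. A function like this could be passed to an optimizer or a sampler, but note that with a shape of 1 or less the likelihood tends to peak right where the line touches the lowest points, so gradient-based fitting may behave poorly near that edge.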