The complete WLS example code can be found here: Weighted Least Squares. For convenience, I'll copy it below.
In [1]: from __future__ import print_function
   ...: import numpy as np
   ...: from scipy import stats
   ...: import statsmodels.api as sm
   ...: import matplotlib.pyplot as plt
   ...: from statsmodels.sandbox.regression.predstd import wls_prediction_std
   ...: from statsmodels.iolib.table import (SimpleTable, default_txt_fmt)
   ...: np.random.seed(1024)
   ...:

In [2]: nsample = 50
   ...: x = np.linspace(0, 20, nsample)
   ...: X = np.column_stack((x, (x - 5)**2))
   ...: X = sm.add_constant(X)
   ...: beta = [5., 0.5, -0.01]
   ...: sig = 0.5
   ...: w = np.ones(nsample)
   ...: w[nsample * 6 // 10:] = 3
   ...: y_true = np.dot(X, beta)
   ...: e = np.random.normal(size=nsample)
   ...: y = y_true + sig * w * e
   ...: X = X[:,[0,1]]
   ...:

In [3]: mod_wls = sm.WLS(y, X, weights=1./w)
   ...: res_wls = mod_wls.fit()
   ...: print(res_wls.summary())
   ...:
What confuses me is this line in the code:
mod_wls = sm.WLS(y, X, weights=1./w)
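The weights are presumed to be (proportional to) the inverse of the variance of the observations. That is, if the variables are to be transformed by 1/sqrt(W), you must supply weights = 1/W.
So shouldn't the WLS model be constructed like this instead: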
mod_wls = sm.WLS(y, X, weights=1./w ** 2)
Am I missing something?
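To make the quoted statement concrete, here is a minimal NumPy-only sketch of the relationship it describes. It rebuilds data similar to the example (noise standard deviation `sig * w`, so noise variance `sig**2 * w**2`) and checks that minimising the weighted sum of squares gives the same coefficients as ordinary least squares on rows scaled by `sqrt(weights)`. The helper names `wls_normal_equations` and `ols_on_whitened` are made up for illustration and are not statsmodels functions; the seed and the straight-line true model are also assumptions of this sketch, so it will not reproduce the numbers in the summary above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, not the one used in the example

nsample = 50
x = np.linspace(0, 20, nsample)
X = np.column_stack((np.ones(nsample), x))  # constant and linear term only
sig = 0.5
w = np.ones(nsample)
w[nsample * 6 // 10:] = 3                   # last 40% of points are noisier

# Noise standard deviation is sig * w, so its variance is sig**2 * w**2.
noise_var = (sig * w) ** 2
y = 5.0 + 0.5 * x + sig * w * rng.normal(size=nsample)


def wls_normal_equations(y, X, weights):
    """Minimise sum(weights * residual**2) via the weighted normal equations."""
    XtW = X.T * weights                     # scales column j of X.T by weights[j]
    return np.linalg.solve(XtW @ X, XtW @ y)


def ols_on_whitened(y, X, weights):
    """Scale each row by sqrt(weights), then run ordinary least squares."""
    s = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return beta


# Inverse of the noise variance, up to the constant factor sig**2.
weights = 1.0 / w ** 2

beta_direct = wls_normal_equations(y, X, weights)
beta_whitened = ols_on_whitened(y, X, weights)

print("variance ratio, noisy vs. quiet points:", noise_var[-1] / noise_var[0])
print("weighted normal equations:", beta_direct)
print("OLS on rescaled data:     ", beta_whitened)
```

With `weights = 1.0 / w ** 2`, each row is divided by `w`, which is the 1/sqrt(W) transformation from the quote when W is the variance. Passing `1.0 / w` instead would divide each row by `sqrt(w)`.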