Question

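I am trying to understand how ridge regression is implemented in scikit-learn's Ridge.

Ridge regression minimizes ||y - Xw||^2 + \alpha * ||w||^2, which has the closed-form solution w = (X'X + \alpha * I)^{-1} X'y.

The intercept and coefficients of the fitted model appear to differ from the closed-form solution. Any ideas how ridge regression is actually implemented in scikit-learn?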

from sklearn import datasets
from sklearn.linear_model import Ridge
import numpy as np

# prepare dataset
# (load_boston was removed in scikit-learn 1.2, so this needs an older release)
boston = datasets.load_boston()
X = boston.data
y = boston.target
# add the w_0 intercept where the corresponding x_0 = 1
Xp = np.concatenate([np.ones((X.shape[0], 1)), X], axis=1)

alpha = 0.5
ridge = Ridge(fit_intercept=True, alpha=alpha)
ridge.fit(X, y)

# 1. intercept and coef of the fit model
print(np.array([ridge.intercept_] + list(ridge.coef_)))
# output:
# array([  3.34288615e+01,  -1.04941233e-01,   4.70136803e-02,
#          2.52527006e-03,   2.61395134e+00,  -1.34372897e+01,
#          3.83587282e+00,  -3.09303986e-03,  -1.41150803e+00,
#          2.95533512e-01,  -1.26816221e-02,  -9.05375752e-01,
#          9.61814775e-03,  -5.30553855e-01])

# 2. the closed form solution (Xp includes the column of 1s, so w_0 is penalized too)
print(np.linalg.inv(Xp.T.dot(Xp) + alpha * np.eye(Xp.shape[1])).dot(Xp.T).dot(y))
# output:
# array([  2.17772079e+01,  -1.00258044e-01,   4.76559911e-02,
#         -6.63573226e-04,   2.68040479e+00,  -9.55123875e+00,
#          4.55214996e+00,  -4.67446118e-03,  -1.25507957e+00,
#          2.52066137e-01,  -1.15766049e-02,  -7.26125030e-01,
#          1.14804636e-02,  -4.92130481e-01])

Answer 1

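The tricky bit is the intercept. The closed-form solution you wrote has no intercept, and when you append a column of 1s to your data you also apply the L2 penalty to the intercept term. scikit-learn's ridge regression does not penalize the intercept.

If you do want the L2 penalty on the bias, simply call ridge on Xp (and turn off intercept fitting in the constructor), and you get: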

>>> ridge = Ridge(fit_intercept=False, alpha=alpha)
>>> ridge.fit(Xp, y)
>>> print(np.array(list(ridge.coef_)))
[  2.17772079e+01  -1.00258044e-01   4.76559911e-02  -6.63573226e-04
   2.68040479e+00  -9.55123875e+00   4.55214996e+00  -4.67446118e-03
  -1.25507957e+00   2.52066137e-01  -1.15766049e-02  -7.26125030e-01
   1.14804636e-02  -4.92130481e-01]
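
Going the other way, the closed form can be adjusted to match scikit-learn's default behaviour by leaving the intercept column unpenalized. The following is a minimal sketch under stated assumptions: the data is synthetic (the Boston dataset is not used here), and the names `penalty`, `w_closed` and `w_sklearn` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (an assumption; any dense numeric X, y would do).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.7, 3.0]) + 4.0 + rng.normal(scale=0.1, size=200)

alpha = 0.5
Xp = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend the column of 1s

# Penalty matrix: alpha on the diagonal, except 0 for the intercept column.
penalty = alpha * np.eye(Xp.shape[1])
penalty[0, 0] = 0.0

# Solve (Xp'Xp + penalty) w = Xp'y without forming an explicit inverse.
w_closed = np.linalg.solve(Xp.T @ Xp + penalty, Xp.T @ y)

ridge = Ridge(fit_intercept=True, alpha=alpha).fit(X, y)
w_sklearn = np.concatenate([[ridge.intercept_], ridge.coef_])

# Compare the two solutions element by element.
print(np.allclose(w_closed, w_sklearn, atol=1e-6))
```

Zeroing the first diagonal entry of the penalty is the only change relative to the closed form in the question; everything else in the normal equations stays the same.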

Answer 2

The analytical solution

(X'X + αI)^-1 X'y

is correct, but the question is what X and y are. There are actually two different interpretations:

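In your analytical calculation, you are actually using X_p, where a column of 1s has been prepended to X (for the intercept), together with the original y. This is what you plug into the equation above.
In sklearn, the interpretation is different. The data is centered first: the mean is subtracted from y to give y_n (and the column means are subtracted from X), the calculation is then performed on the centered data, and the intercept is recovered from the means afterwards.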

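It is clear why you thought your interpretation was correct, since in OLS there is no difference. But once you add the ridge penalty, your interpretation also penalizes the coefficient of the first column (the intercept), which does not make much sense.

If you do the following: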
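
Written out, the two interpretations correspond to two slightly different objectives. This is a sketch in my own notation: w_0 denotes the intercept and \mathbf{1} the column of ones, neither of which is named this way in the original post.

```latex
% Interpretation 1: column of ones inside X_p, so the intercept w_0 is penalized
\min_{w_0,\, w} \; \lVert y - w_0 \mathbf{1} - X w \rVert_2^2
    + \alpha \left( w_0^2 + \lVert w \rVert_2^2 \right)

% Interpretation 2 (Ridge with fit_intercept=True): the intercept w_0 is not penalized
\min_{w_0,\, w} \; \lVert y - w_0 \mathbf{1} - X w \rVert_2^2
    + \alpha \lVert w \rVert_2^2
```

For \alpha = 0 both reduce to ordinary least squares, which is why the distinction only shows up once the penalty is added.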

alpha = 0.5
ridge = Ridge(fit_intercept=True, alpha=alpha)
ridge.fit(X, y - np.mean(y))
# 1. intercept and coef of the fit model
print(np.array([ridge.intercept_] + list(ridge.coef_)))

# center every column of Xp (the matrix with the prepended 1s from the question);
# the column of 1s becomes all zeros
Xp = Xp - np.mean(Xp, axis=0)
# 2. the closed form solution
print(np.linalg.inv(Xp.T.dot(Xp) + alpha * np.eye(Xp.shape[1])).dot(Xp.T).dot(y))

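then you will see the same coefficients from both. Only the first entry differs: the centered column of 1s is all zeros, so its closed-form weight is 0, while ridge.intercept_ absorbs the feature means.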
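
The centering argument can also be checked end to end, including the intercept. The sketch below is hedged: it uses synthetic data rather than the Boston dataset, and `Xc`, `yc`, `coef` and `intercept` are names introduced here for illustration. It centers X and y, applies the closed form to the centered data, and recovers the intercept from the means.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data with non-zero feature means (an assumption for illustration).
rng = np.random.default_rng(42)
X = rng.normal(loc=3.0, size=(150, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + 10.0 + rng.normal(scale=0.2, size=150)

alpha = 0.5

# Center features and target.
x_mean, y_mean = X.mean(axis=0), y.mean()
Xc, yc = X - x_mean, y - y_mean

# Closed-form ridge on centered data: (Xc'Xc + alpha*I) w = Xc'yc
coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)

# The unpenalized intercept follows from the means.
intercept = y_mean - x_mean @ coef

ridge = Ridge(fit_intercept=True, alpha=alpha).fit(X, y)
print(np.allclose(coef, ridge.coef_, atol=1e-6))
print(np.isclose(intercept, ridge.intercept_, atol=1e-6))
```

Here the model is fit on the uncentered X and y, so `ridge.intercept_` can be compared directly with the intercept recovered from the means.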

Understanding ridge linear regression in scikit-learn
