Question

This is a scikit-learn error that I get when I do

my_estimator = LassoLarsCV(fit_intercept=False, normalize=False, positive=True, max_n_alphas=1e5)

Note that if I decrease max_n_alphas from 1e5 to 1e4, I no longer get this error.

Does anyone have an idea what is going on?

The error happens when I call

my_estimator.fit(x, y)

I have 40k data points in 40 dimensions.

The full stack trace looks like this:

  File "/usr/lib64/python2.7/site-packages/sklearn/linear_model/least_angle.py", line 1113, in fit
    axis=0)(all_alphas)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/polyint.py", line 79, in __call__
    y = self._evaluate(x)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 498, in _evaluate
    out_of_bounds = self._check_bounds(x_new)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 525, in _check_bounds
    raise ValueError("A value in x_new is below the interpolation "
ValueError: A value in x_new is below the interpolation range.

Answer 1

There must be something particular about your data. LassoLarsCV() seems to work correctly with this synthetic example of fairly well-behaved data:

import numpy
import sklearn.linear_model

# create 40000 x 40 sample data from linear model with a bit of noise
npoints = 40000
ndims = 40
numpy.random.seed(1)
X = numpy.random.random((npoints, ndims))
w = numpy.random.random(ndims)
y = X.dot(w) + numpy.random.random(npoints) * 0.1

clf = sklearn.linear_model.LassoLarsCV(fit_intercept=False, normalize=False, max_n_alphas=1e6)
clf.fit(X, y)

# coefficients are almost exactly recovered, this prints 0.00377
print(max(abs(clf.coef_ - w)))

# alphas actually used are 41 or ndims+1
print(clf.alphas_.shape)

This is with sklearn 0.16, where I don't have the positive=True option.

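I'm not sure why you would want such a large max_n_alphas anyway. While I don't know why 1e+4 works and 1e+5 doesn't in your case, I suspect the paths you get from max_n_alphas=ndims+1 and from max_n_alphas=1e+4 (or any larger value) would be identical for well-behaved data. The optimal alpha estimated by cross-validation would also be identical. Check out the "Lasso path using LARS" example to see what alpha is trying to do.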

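Also, from the LassoLars documentation:

alphas_ : array, shape (n_alphas + 1,)

Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features, or the number of nodes in the path with correlation greater than alpha, whichever is smaller.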

So it makes sense that we end up with alphas_ of size ndims + 1 (i.e. n_features + 1) above.

P.S. Tested with sklearn 0.17.1 and positive=True as well, and also with a mix of positive and negative coefficients; same result: alphas_ has size ndims + 1 or less.
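
As for the error message itself: the traceback shows it comes from the bounds check in scipy's interpolation code, which refuses to evaluate at a point below the smallest x it was built from. Here is a minimal pure-Python sketch of that kind of check. It is a simplified stand-in, not scipy's or scikit-learn's actual code; the function interpolate_linear and the sample values are made up for illustration.

```python
from bisect import bisect_right


def interpolate_linear(xs, ys, x_new):
    """Linearly interpolate at x_new, refusing to extrapolate.

    Simplified stand-in for a bounds-checked 1-D interpolator; this is
    not scipy's implementation. xs must be sorted in increasing order
    and contain at least two points.
    """
    if x_new < xs[0]:
        raise ValueError("A value in x_new is below the interpolation range.")
    if x_new > xs[-1]:
        raise ValueError("A value in x_new is above the interpolation range.")
    # Find the segment [xs[i], xs[i + 1]] that contains x_new.
    i = min(bisect_right(xs, x_new) - 1, len(xs) - 2)
    t = (x_new - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical values: a grid of alphas and one error value per alpha.
alphas = [0.1, 0.2, 0.4]
errors = [1.0, 2.0, 4.0]

# Inside [0.1, 0.4]: interpolation succeeds.
inside = interpolate_linear(alphas, errors, 0.3)

# Below the smallest alpha: the bounds check raises.
try:
    interpolate_linear(alphas, errors, 0.05)
    message = None
except ValueError as exc:
    message = str(exc)
```

The call at 0.05 fails because it lies below 0.1, the smallest grid point. By analogy, the same message in your trace suggests that some value passed in as all_alphas fell below the range the interpolator was built on. I still can't say why that happens with your data.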
