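The scikit-learn IsotonicRegression class appears to fail in its score method.
I am feeding it perfectly good arrays of floats. The fit method runs without raising any exception, but the score method then raises the following error: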
.../anaconda/lib/python2.7/site-packages/scipy/interpolate/interpolate.py:445: RuntimeWarning: invalid value encountered in true_divide
  slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
Traceback (most recent call last):
  File "iostonic_error.py", line 20, in <module>
    print model.score(x_1d,y_1d)
  File ".../anaconda/lib/python2.7/site-packages/sklearn/base.py", line 324, in score
    return r2_score(y, self.predict(X), sample_weight=sample_weight)
  File ".../anaconda/lib/python2.7/site-packages/sklearn/metrics/metrics.py", line 2324, in r2_score
    y_type, y_true, y_pred = _check_reg_targets(y_true, y_pred)
  File ".../anaconda/lib/python2.7/site-packages/sklearn/metrics/metrics.py", line 65, in _check_reg_targets
    y_true, y_pred = check_arrays(y_true, y_pred)
  File ".../anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 283, in check_arrays
    _assert_all_finite(array)
  File ".../anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 43, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Here is a minimal example that reproduces the error:
from sklearn.isotonic import IsotonicRegression
x_1d = [ 3.97948718,4.,2.,2.97948718,4.48974359,4.46923077, 2.,3.46923077,4.46923077,4.46923077,3.97948718,3.97948718, 5.46923077,3.,2.48974359]
y_1d = [ 19.,9.,27.,27.,12.,17.,34.,30.,23.,25.,18.,21.,11.,24.,33.]
model = IsotonicRegression(y_min=0, increasing=False, out_of_bounds='clip')
model.fit(x_1d,y_1d)
print model.score(x_1d,y_1d)
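I looked at the source and found that this error message comes from sklearn.utils.validation._assert_all_finite. As a test, I ran the exact code of that function on the input data (x_1d and y_1d) that I am feeding to the regression class. It raised no exception, which leads me to believe that somewhere the score method manipulates the data in a way that corrupts it and introduces missing or np.inf values.
Does anyone know what might be causing this?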
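A finiteness check of this kind can be sketched with NumPy alone. The helper name `report_non_finite` is hypothetical and not part of scikit-learn; it only approximates what a check such as `_assert_all_finite` looks for, and it is written so that it runs on both Python 2 and Python 3.

```python
import numpy as np


def report_non_finite(name, values):
    """Summarise NaN/inf entries in a 1-D sequence (hypothetical helper)."""
    arr = np.asarray(values, dtype=np.float64)
    bad = ~np.isfinite(arr)  # True where an entry is NaN, +inf or -inf
    return {
        "name": name,
        "n_nan": int(np.isnan(arr).sum()),
        "n_inf": int(np.isinf(arr).sum()),
        "bad_indices": np.flatnonzero(bad).tolist(),
    }


# Same input data as in the reproduction above.
x_1d = [3.97948718, 4., 2., 2.97948718, 4.48974359, 4.46923077, 2.,
        3.46923077, 4.46923077, 4.46923077, 3.97948718, 3.97948718,
        5.46923077, 3., 2.48974359]
y_1d = [19., 9., 27., 27., 12., 17., 34., 30., 23., 25., 18., 21.,
        11., 24., 33.]

for name, values in (("x_1d", x_1d), ("y_1d", y_1d)):
    print(report_non_finite(name, values))
```

Passing `model.predict(x_1d)` to the same helper after fitting would show whether the non-finite values first appear in the predictions rather than in the inputs.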