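I have a clean dataset with zero NaN values, but I still get the same error from my regressor. My DataFrame is called new_player_data.
I tried looking for any missing values with
list(new_player_data.where(new_player_data.isna()).count() > 0)
which returns
[False, False, False, False, False, False]
and so on, about 200 times. I thought some of the floats might be too large, so I tried: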
for i in new_player_data.columns[:]:
    if new_player_data[i].dtype == float:
        new_player_data[i] = round(new_player_data[i], 2)
Whatever I try, I still get:
regressor.fit(X_train, y_train)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-327-3a664017ddaa> in <module>
----> 1 regressor.fit(X_train, y_train)
/anaconda3/lib/python3.7/site-packages/sklearn/ensemble/forest.py in fit(self, X, y, sample_weight)
248
249 # Validate or convert input data
--> 250 X = check_array(X, accept_sparse="csc", dtype=DTYPE)
251 y = check_array(y, accept_sparse='csc', ensure_2d=False, dtype=None)
252 if sample_weight is not None:
/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
571 if force_all_finite:
572 _assert_all_finite(array,
--> 573 allow_nan=force_all_finite == 'allow-nan')
574
575 shape_repr = _shape_repr(array.shape)
/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan)
54 not allow_nan and not np.isfinite(X).all()):
55 type_err = 'infinity' if allow_nan else 'NaN, infinity'
---> 56 raise ValueError(msg_err.format(type_err, X.dtype))
57
58
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Any ideas on what else I should check here? I'm at a loss.
Answer 0 (score: 0)
Thanks to @gmds for the answer. It turned out to be inf values, which I found with
pd.options.mode.use_inf_as_na = True
infs = np.where(np.isinf(new_player_data))
infs
out: (array([], dtype=int64), array([], dtype=int64))
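A likely reason the check in the question came back all False: under default pandas options isna() does not flag inf, and DataFrame.count() skips NaN, so where(isna()).count() is zero for every column whatever the data holds. Below is a minimal sketch of a per-column check for everything the error message mentions (NaN, infinity, and values beyond the float32 range). It runs on a small made-up frame in place of new_player_data; the column names and the helper name find_bad_columns are hypothetical.

```python
import numpy as np
import pandas as pd


def find_bad_columns(df):
    """Count, per numeric column, the values that are NaN, infinite,
    or finite but larger in magnitude than float32 can represent."""
    numeric = df.select_dtypes(include=[np.number]).astype("float64")
    values = numeric.to_numpy()
    f32_max = np.finfo(np.float32).max  # about 3.4e38

    report = pd.DataFrame(
        {
            "nan": np.isnan(values).sum(axis=0),
            "inf": np.isinf(values).sum(axis=0),
            # finite in float64, but would overflow when cast to float32
            "too_large": (np.isfinite(values) & (np.abs(values) > f32_max)).sum(axis=0),
        },
        index=numeric.columns,
    )
    # Keep only the columns that have at least one problem value.
    return report[report.sum(axis=1) > 0]


# Made-up data standing in for new_player_data (assumption).
demo = pd.DataFrame(
    {
        "goals": [1.0, 2.0, 3.0],
        "ratio": [0.5, np.inf, -np.inf],
        "minutes": [90.0, np.nan, 1e39],
    }
)

bad = find_bad_columns(demo)
print(bad)
```

Calling find_bad_columns(new_player_data) on the real frame should then list only the columns worth inspecting, which is easier to read than the raw index arrays from np.where.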
I then replaced them.
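One common way to do such a replacement is sketched here, as an assumption rather than the exact code used in this answer: convert inf and -inf to NaN with DataFrame.replace, then fill or drop the missing values. The demo frame and the choice of the column median as the fill value are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for new_player_data (assumption).
demo = pd.DataFrame(
    {
        "goals": [1.0, 2.0, 3.0],
        "ratio": [0.5, np.inf, -np.inf],
    }
)

# Step 1: mark +inf and -inf as missing values.
cleaned = demo.replace([np.inf, -np.inf], np.nan)

# Step 2: fill the gaps with each column's median (an assumed choice);
# cleaned.dropna() would instead drop the affected rows.
cleaned = cleaned.fillna(cleaned.median())

# Confirm nothing non-finite is left before calling regressor.fit.
all_finite = bool(np.isfinite(cleaned.to_numpy()).all())
print(all_finite)
```

Whether to fill or drop depends on how the inf values arose. If they come from a division by zero in a derived column, fixing that calculation upstream may be the better option.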
Thanks to gmds for pointing me in the right direction!