I have the following code:
X = df_X.as_matrix(header[1:col_num])
scaler = preprocessing.StandardScaler().fit(X)
X_nor = scaler.transform(X)
and I get the following error:
File "/Users/edamame/Library/python_virenv/lib/python2.7/site-packages/sklearn/utils/validation.py", line 54, in _assert_all_finite
" or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
I tried:
print(np.isinf(X))
print(np.isnan(X))
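which gave the output below. Because NumPy truncates the printout of large arrays and I have millions of rows, this does not tell me which element is the problem.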
[[False False False ..., False False False]
[False False False ..., False False False]
[False False False ..., False False False]
...,
[False False False ..., False False False]
[False False False ..., False False False]
[False False False ..., False False False]]
Is there a way to determine which value in the matrix X is actually causing the problem? And how do people generally avoid it?
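For context, what is being asked for can be sketched as follows: instead of printing the full boolean mask, ask NumPy for the coordinates of the entries that are not finite. This is only a minimal sketch, assuming X is a floating-point array; the small sample array and the names bad_mask, bad_positions and bad_rows are made up for illustration and are not part of the original code.

```python
import numpy as np

# Hypothetical stand-in for X: a small float array with one NaN and one inf.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, 8.0, np.inf]])

# True wherever an entry is NaN, +inf or -inf.
bad_mask = ~np.isfinite(X)

# Single yes/no answer instead of a truncated printout of the mask.
has_bad = bad_mask.any()

# (row, column) index pairs of every offending entry.
bad_positions = np.argwhere(bad_mask)

# Indices of rows that contain at least one offending entry.
bad_rows = np.where(bad_mask.any(axis=1))[0]

print(has_bad)
print(bad_positions)
print(bad_rows)
```

With millions of rows, bad_positions stays small as long as only a few entries are non-finite, so it can be printed or inspected directly.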
Answer 0 (score: 6)