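RuntimeWarning when using EllipticEnvelope

Time: 2019-06-28 09:50:31

Tags: python numpy scikit-learn

I am trying to use an elliptic envelope to detect outliers in my data. However, after scaling, I get many runtime warnings that prevent me from making the predictions I need.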

C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:677: RuntimeWarning: invalid value encountered in true_divide
  self.dist_ /= correction
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:716: RuntimeWarning: invalid value encountered in less
  mask = self.dist_ < chi2(n_features).isf(0.025)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:720: RuntimeWarning: Mean of empty slice.
  location_reweighted = data[mask].mean(0)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\core\_methods.py:78: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:392: RuntimeWarning: Mean of empty slice.
  avg = a.mean(axis)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\empirical_covariance_.py:81: RuntimeWarning: Degrees of freedom <= 0 for slice
  covariance = np.cov(X.T, bias=1)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:2451: RuntimeWarning: divide by zero encountered in true_divide
  c *= np.true_divide(1, fact)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:2451: RuntimeWarning: invalid value encountered in multiply
  team  Family       date      Amount  scaled_amount
0  S  Engineering 2018-01-05   -0.02      -1.000000
1  S  Engineering 2018-02-06   -0.01      -0.333333
2  S  Engineering 2018-03-06    0.00       0.333333
3  S  Engineering 2018-04-06    0.00       0.333333
4  S  Engineering 2018-05-07    0.00       0.333333
5  S  Engineering 2018-06-06    0.00       0.333333
6  S  Engineering 2018-07-05    0.00       0.333333
7  S  Engineering 2018-08-06    0.00       0.333333
8  S  Engineering 2018-09-06    0.00       0.333333
9  S  Engineering 2018-10-04    0.01       1.000000

The code I use for prediction is as follows:

scaled_amount_reshaped = key.scaled_amount.values.reshape(-1, 1)
model = EllipticEnvelope(contamination=0.18)
model.fit(abs(scaled_amount_reshaped))
prediction = model.predict(abs(scaled_amount_reshaped))

1 Answer:

Answer 0: (score: 0)

Try:

model = EllipticEnvelope(contamination=0.18, support_fraction=0.7)
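
Whether a given `support_fraction` is large enough depends on the data. The sketch below is not scikit-learn's algorithm; it is a standard-library illustration that assumes the default support size is ceil((n_samples + n_features + 1) / 2) and that the estimator keeps the lowest-variance subset of that size. It uses the ten absolute scaled amounts as printed in the question, and the helper name `tightest_window_variance` is made up for this example. A subset with zero variance would lead to the kind of divisions by zero that the warnings report.

```python
import math
from statistics import pvariance

# Absolute scaled amounts as printed in the question's ten rows.
values = [1.0, 0.333333, 0.333333, 0.333333, 0.333333,
          0.333333, 0.333333, 0.333333, 0.333333, 1.0]


def tightest_window_variance(data, n_support):
    """Smallest population variance over all runs of n_support sorted values.

    For one feature, the lowest-variance subset of a given size is always
    a contiguous run of the sorted data, so a sliding window finds it.
    """
    ordered = sorted(data)
    last_start = len(ordered) - n_support
    return min(pvariance(ordered[i:i + n_support]) for i in range(last_start + 1))


n_samples, n_features = len(values), 1
# Assumption: default support size is ceil((n_samples + n_features + 1) / 2).
default_support = math.ceil((n_samples + n_features + 1) / 2)

# Support sizes whose tightest subset has (numerically) zero variance.
degenerate_sizes = [
    h for h in range(default_support, n_samples + 1)
    if tightest_window_variance(values, h) < 1e-12
]
smallest_usable = max(degenerate_sizes, default=default_support - 1) + 1

print("default support size:", default_support)
print("degenerate support sizes:", degenerate_sizes)
print("smallest non-degenerate support size:", smallest_usable)
```

On these printed values, eight of the ten points are identical, so support sizes 6 through 8 are degenerate. If `support_fraction=0.7` translates to about 7 points (assuming the count is `support_fraction * n_samples` truncated to an integer), the same warnings may still appear on this data; a fraction of 0.9 or higher would cover at least 9 points here.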

Or check it with the code below. If your data comes in chunks, try using Dask and combining it with this code: Tensorflow-DASK

import numpy as np
from sklearn.covariance import EllipticEnvelope

# Placeholder covariance matrix; replace it with your own 2x2 matrix
true_cov = np.array([[0.8, 0.3], [0.3, 0.4]])
X = np.random.RandomState(0).multivariate_normal(mean=[0, 0], cov=true_cov, size=500)

model = EllipticEnvelope(random_state=0).fit(X)
# predict returns 1 for an inlier and -1 for an outlier
model.predict(X)  # in the question: abs(scaled_amount_reshaped)
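
Since `predict` returns 1 for an inlier and -1 for an outlier, the flagged rows can be picked out by position. A minimal sketch, where `outlier_indices` is a hypothetical helper and `labels` is made-up output rather than a result from the question's data:

```python
def outlier_indices(labels):
    """Return the positions whose label is -1 (outlier)."""
    return [i for i, label in enumerate(labels) if label == -1]


# Hypothetical predict() output: 1 marks an inlier, -1 an outlier.
labels = [1, 1, -1, 1, -1]
flagged = outlier_indices(labels)
print(flagged)
```

The resulting positions could then be used to index back into the original rows.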