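I am trying to use EllipticEnvelope to detect outliers in my data. However, after scaling the data, I get many runtime warnings, and they prevent me from making the predictions I need.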
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:677: RuntimeWarning: invalid value encountered in true_divide
self.dist_ /= correction
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:716: RuntimeWarning: invalid value encountered in less
mask = self.dist_ < chi2(n_features).isf(0.025)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\robust_covariance.py:720: RuntimeWarning: Mean of empty slice.
location_reweighted = data[mask].mean(0)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\core\_methods.py:78: RuntimeWarning: invalid value encountered in true_divide
ret, rcount, out=ret, casting='unsafe', subok=False)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:392: RuntimeWarning: Mean of empty slice.
avg = a.mean(axis)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\sklearn\covariance\empirical_covariance_.py:81: RuntimeWarning: Degrees of freedom <= 0 for slice
covariance = np.cov(X.T, bias=1)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:2451: RuntimeWarning: divide by zero encountered in true_divide
c *= np.true_divide(1, fact)
C:\HOMEWARE\Anaconda3-Windows-x86_64\lib\site-packages\numpy\lib\function_base.py:2451: RuntimeWarning: invalid value encountered in multiply
  team       Family        date  Amount  scaled_amount
0    S  Engineering  2018-01-05   -0.02      -1.000000
1    S  Engineering  2018-02-06   -0.01      -0.333333
2    S  Engineering  2018-03-06    0.00       0.333333
3    S  Engineering  2018-04-06    0.00       0.333333
4    S  Engineering  2018-05-07    0.00       0.333333
5    S  Engineering  2018-06-06    0.00       0.333333
6    S  Engineering  2018-07-05    0.00       0.333333
7    S  Engineering  2018-08-06    0.00       0.333333
8    S  Engineering  2018-09-06    0.00       0.333333
9    S  Engineering  2018-10-04    0.01       1.000000
The code I use for prediction is as follows:
scaled_amount_reshaped = key.scaled_amount.values.reshape(-1, 1)
model = EllipticEnvelope(contamination=0.18)
model.fit(abs(scaled_amount_reshaped))
prediction = model.predict(abs(scaled_amount_reshaped))
Answer 0 (score: 0)
Try:
model = EllipticEnvelope(contamination=0.18, support_fraction=0.7)
Or check it with the code below. If your data comes in chunks, try using Dask and combine it with this code: Tensorflow-DASK
import numpy as np
from sklearn.covariance import EllipticEnvelope
# Placeholder 2x2 covariance matrix; replace it with your own matrix
true_cov = np.array([[0.8, 0.3], [0.3, 0.4]])
X = np.random.RandomState(0).multivariate_normal(mean=[0, 0], cov=true_cov, size=500)
model = EllipticEnvelope(random_state=0).fit(X)
# predict returns 1 for an inlier and -1 for an outlier
model.predict(X)  # in the question: abs(scaled_amount_reshaped)
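To see why support_fraction can matter for data like the table in the question, here is a rough diagnostic sketch in plain NumPy. It does not call scikit-learn. The helper support_size is hypothetical, and the rule inside it is my assumption about how the robust subset size is chosen (it may differ between scikit-learn versions). The idea: if the robust subset can be filled entirely with identical values, its variance is zero, which would be consistent with the division warnings shown above.

```python
import math
from collections import Counter

import numpy as np

# scaled_amount column copied from the question's table (10 rows, 1 feature)
scaled_amount = np.array([-1.0, -0.333333, 0.333333, 0.333333, 0.333333,
                          0.333333, 0.333333, 0.333333, 0.333333, 1.0])
X = np.abs(scaled_amount).reshape(-1, 1)
n_samples, n_features = X.shape


def support_size(n_samples, n_features, support_fraction=None):
    """Hypothetical helper: number of points in the robust subset.

    Assumption: this mirrors the default rule used by scikit-learn's
    minimum covariance determinant estimator; verify it for your version.
    """
    if support_fraction is None:
        return math.ceil(0.5 * (n_samples + n_features + 1))
    return int(support_fraction * n_samples)


# Size of the largest group of exactly identical values
largest_tie = max(Counter(X.ravel().tolist()).values())

# For each candidate fraction, record whether the whole subset could
# consist of identical values (and therefore have zero variance)
results = {}
for fraction in (None, 0.7, 0.9):
    h = support_size(n_samples, n_features, fraction)
    results[fraction] = h <= largest_tie
    print(f"support_fraction={fraction}: subset size {h}, "
          f"zero-variance subset possible: {results[fraction]}")
```

For this particular table, eight of the ten absolute values are identical, so under the assumed rule even support_fraction=0.7 still leaves room for a zero-variance subset. A larger value, or data with more variation, may be needed.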