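I am trying to apply PCA for multivariate analysis and plot the score plot of the first two components with a Hotelling T2 confidence ellipse in Python. I can already produce the scatter plot, and I would like to add a 95% confidence ellipse to it. It would be great if someone knows how to do this in Python.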
Sample image of the expected output:
Answer 0 (score: 0)
The pca library provides outlier detection based on Hotelling T2 and SPE/DmodX.
pip install pca
from pca import pca
import pandas as pd
import numpy as np
# Create dataset with 100 samples
X = np.array(np.random.normal(0, 1, 500)).reshape(100, 5)
# Create 5 outliers
outliers = np.array(np.random.uniform(5, 10, 25)).reshape(5, 5)
# Combine data
X = np.vstack((X, outliers))
# Initialize model. Alpha is the threshold for the hotellings T2 test to determine outliers in the data.
model = pca(alpha=0.05)
# Fit transform
out = model.fit_transform(X)
Print the outliers with:
print(out['outliers'])
#            y_proba      y_score  y_bool  y_bool_spe  y_score_spe
# 1.0   9.799576e-01     3.060765   False       False     0.993407
# 1.0   8.198524e-01     5.945125   False       False     2.331705
# 1.0   9.793117e-01     3.086609   False       False     0.128518
# 1.0   9.743937e-01     3.268052   False       False     0.794845
# 1.0   8.333778e-01     5.780220   False       False     1.523642
# ..             ...          ...     ...         ...          ...
# 1.0   6.793085e-11    69.039523    True        True    14.672828
# 1.0  2.610920e-291  1384.158189    True        True    16.566568
# 1.0   6.866703e-11    69.015237    True        True    14.936442
# 1.0  1.765139e-292  1389.577522    True        True    17.183093
# 1.0  1.351102e-291  1385.483398    True        True    17.319038
Make the plot:
model.biplot(legend=True, SPE=True, hotellingt2=True)
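If you would rather draw the 95% ellipse on your own scatter plot, it can probably be computed by hand from the scores. The following is only a minimal sketch, not what the pca package does internally. It assumes two components, PCA via NumPy's SVD, and the T2 limit A(n^2 - 1) / (n(n - A)) * F(1 - alpha; A, n - A), which is one common convention (other texts use a slightly different scaling factor). The helper names pca_scores, hotelling_ellipse and plot_scores are made up for illustration.

```python
import numpy as np
from scipy import stats


def pca_scores(X, n_components=2):
    """Project mean-centered data onto its first principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def hotelling_ellipse(scores, alpha=0.05, n_points=200):
    """Semi-axes and outline of the (1 - alpha) Hotelling T2 ellipse.

    Assumes exactly two score columns. PCA scores are centered and
    uncorrelated, so the ellipse is axis-aligned around the origin.
    """
    n, a = scores.shape
    f_crit = stats.f.ppf(1 - alpha, a, n - a)
    # Assumed convention for the T2 limit (see the note above).
    t2_limit = a * (n**2 - 1) / (n * (n - a)) * f_crit
    semi_axes = np.sqrt(scores.var(axis=0, ddof=1) * t2_limit)
    theta = np.linspace(0, 2 * np.pi, n_points)
    outline = np.column_stack(
        (semi_axes[0] * np.cos(theta), semi_axes[1] * np.sin(theta))
    )
    return semi_axes, outline


def plot_scores(scores, outline):
    """Scatter the scores and overlay the ellipse outline."""
    import matplotlib.pyplot as plt  # imported here so the math runs without it

    fig, ax = plt.subplots()
    ax.scatter(scores[:, 0], scores[:, 1], s=15)
    ax.plot(outline[:, 0], outline[:, 1], color="red", label="95% Hotelling T2")
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    ax.legend()
    return ax


# Example data: 100 samples, 5 variables
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(100, 5))
scores = pca_scores(X)
semi_axes, outline = hotelling_ellipse(scores, alpha=0.05)
```

Calling plot_scores(scores, outline) followed by matplotlib's plt.show() should then display the score plot with the ellipse. Points for which ((scores / semi_axes) ** 2).sum(axis=1) exceeds 1 lie outside it.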