sklearn PLSRegression - Variance of X explained by latent vectors

Time: 2017-09-20 17:04:55

Tags: python machine-learning scikit-learn regression variance

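I am using Python's sklearn.cross_decomposition.PLSRegression to perform a partial least squares regression.

For each PLS component, is there a way to retrieve the fraction of explained variance in X, i.e. R²(X)? I am looking for something similar to the explvar() function in the R pls package. However, I would also appreciate any suggestions on how to compute it myself.

There is a similar question with an answer that explains how to obtain the variance of Y. I guess "variance in Y" was what was asked for in that case. That is why I opened a new question; I hope that is O.K.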

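2 answers:

Answer 0: (score: 3)

I managed to find a solution to the problem. The following gives the fraction of variance in X explained by each latent vector after PLS regression: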

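import numpy as np
from sklearn import cross_decomposition

# X is a numpy ndarray with samples in rows and predictor variables in columns
# y is a one-dimensional ndarray containing the response variable

# PLSRegression standardizes X by default (scale=True, using ddof=1), so the
# total variance is taken from the standardized data and summed over all columns
# (without the sum, the division below fails unless X has exactly 5 columns)
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
total_variance_in_x = np.sum(np.var(X_std, axis=0))

pls1 = cross_decomposition.PLSRegression(n_components=5)
pls1.fit(X, y)

# variance in transformed X data for each latent vector:
variance_in_x = np.var(pls1.x_scores_, axis=0)

# normalize variance by total variance:
fractions_of_explained_variance = variance_in_x / total_variance_in_x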

Answer 1: (score: 0)

I am not sure about this, so if anyone can contribute something...

Following these:

https://ro-che.info/articles/2017-12-11-pca-explained-variance

https://www.ibm.com/docs/de/spss-statistics/24.0.0?topic=reduction-total-variance-explained

variance_in_x = np.var(pls1.x_scores_, axis = 0) 
fractions_of_explained_variance = variance_in_x / np.sum(variance_in_x)