Question

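I am using sklearn in Python to perform principal component analysis (PCA).

One of my goals is to build one model with 4 components and another with 8, and to use inverse_transform to compare each reconstruction with the original data.

The code looks like this: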

from sklearn.decomposition import PCA

pca4 = PCA(n_components=4)
pca4.fit(parkinsonData)
scores4 = pca4.transform(parkinsonData)
reconstruct4 = pca4.inverse_transform(scores4)

To compute the difference between the original data and the reconstruction, I did:

differenceMatrix=parkinsonData-reconstruct4

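Now I have the differences, but I want to measure how much of the original dataset was lost. For that, I want to compute the mean of the differences between each element of the original dataset and the corresponding reconstructed element, each powered by 2 (that is, squared).

The last statement gives me the element-wise differences between the original dataset and the reconstruction, but now I have to raise them to the power of 2. I don't know how to do that, because when I use: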

power=differenceMatrix**

I get the error: this matrix is not square.

To work around this, I used

np.power(differenceMatrix,differenceMatrix)

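It runs, but some elements are NaN. I assume this is because the matrix is not square.

Does anyone know how to solve this and compute the data lost between the original dataset and the PCA-transformed dataset?

Thanks.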

Answer 1

To square each element of the matrix (I guess that's what you mean by "powered by 2"), use:

np.square(differenceMatrix)

This works element-wise and does not restrict you to matrices of square shape. NaNs in the matrix are returned as NaN in the output.
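
As a rough sketch of how this fits into the whole comparison, the snippet below computes the mean squared reconstruction error for a 4-component and an 8-component model. The random array `data` is only a hypothetical stand-in for parkinsonData, and `reconstruction_mse` is a helper name made up for this example. Note that np.power(differenceMatrix, differenceMatrix) raises each element to the power of itself rather than to 2, so a negative difference with a fractional value produces NaN; that, not the shape of the matrix, is the likely source of the NaNs.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for parkinsonData: 100 samples, 10 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))


def reconstruction_mse(data, n_components):
    """Mean of the squared element-wise differences between the data
    and its reconstruction from an n_components PCA model."""
    # Convert to a plain ndarray so that arithmetic is element-wise
    # even if the input was an np.matrix.
    data = np.asarray(data)
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(data)
    reconstructed = pca.inverse_transform(scores)
    difference = data - reconstructed
    return np.mean(np.square(difference))


mse4 = reconstruction_mse(data, 4)
mse8 = reconstruction_mse(data, 8)
print(mse4, mse8)
```

The model that keeps more components should give the smaller error, and keeping as many components as there are features should reconstruct the data almost exactly.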