Question

Does sklearn's PCA treat the columns of a DataFrame as the vectors to reduce, or the rows?

Because when I do this:

import pandas as pd
from sklearn.decomposition import PCA

df = pd.DataFrame([[1,-21,45,3,4],[4,5,89,-5,6],[7,-4,58,1,19],[10,11,74,20,12],[13,14,15,45,78]])  # 5 rows, 5 columns
pca = PCA(n_components=3)
pca.fit(df)
df_pcs = pd.DataFrame(data=pca.components_, index=df.index)

I get the following error:

ValueError: Shape of passed values is (5, 3), indices imply (5, 5)

Answer 1

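Rows represent samples and columns represent features. PCA reduces the dimensionality of the data, i.e. the number of features. So: columns.

So if you are talking about vectors, it treats each row as a single feature vector and reduces its length.

If you have a DataFrame of shape [100, 6] and PCA's n_components is set to 3, then your output will have shape [100, 3].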
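
As a quick check of the shapes described above, here is a minimal sketch. The random data, the seed, and the variable names `rng` and `reduced` are made up for illustration; only `PCA`, `n_components`, and `fit_transform` come from scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Illustrative data only: 100 samples (rows) and 6 features (columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 6)))

pca = PCA(n_components=3)
reduced = pca.fit_transform(df)  # NumPy array, one row per input row

print(df.shape)       # (100, 6)
print(reduced.shape)  # (100, 3)
```

The number of rows is unchanged; only the number of columns drops from 6 to 3.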

# You need this: it projects each row onto the 3 components (result shape [5, 3])
df_pcs = pca.transform(df)

# This raises the error because the shapes don't match
df_pcs = pd.DataFrame(data=pca.components_, index=df.index)

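pca.components_ is an array of shape [3, 5], whereas your index argument is df.index, which has shape [5,]. Hence the error. pca.components_ represents something entirely different: its rows are the principal axes, not the transformed samples.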
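
If you want both results as labelled DataFrames, one possible sketch is below. It reuses the data from the question; the labels in `pc_names` and the names `df_scores` and `df_axes` are hypothetical choices for illustration, not anything defined by sklearn.

```python
import pandas as pd
from sklearn.decomposition import PCA

df = pd.DataFrame([[1, -21, 45, 3, 4],
                   [4, 5, 89, -5, 6],
                   [7, -4, 58, 1, 19],
                   [10, 11, 74, 20, 12],
                   [13, 14, 15, 45, 78]])

pca = PCA(n_components=3)
pca.fit(df)

pc_names = ["PC1", "PC2", "PC3"]  # hypothetical labels for the 3 components

# Transformed samples: one row per original row, one column per component.
df_scores = pd.DataFrame(pca.transform(df), index=df.index, columns=pc_names)

# Principal axes: one row per component, one column per original feature.
df_axes = pd.DataFrame(pca.components_, index=pc_names, columns=df.columns)

print(df_scores.shape)  # (5, 3)
print(df_axes.shape)    # (3, 5)
```

Here df.index fits the transformed data because it still has 5 rows, while pca.components_ is labelled by component on the rows and by original feature on the columns.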

According to the documentation:

components_ : array, [n_components, n_features]

PCA sklearn - which dimension does it take
