Question

The error is raised on this line:

log_centers = pca.inverse_transform(centers)

Code:

# TODO: Apply your clustering algorithm of choice to the reduced data 
clusterer = KMeans(n_clusters=2, random_state=0).fit(reduced_data)

# TODO: Predict the cluster for each data point
preds = clusterer.predict(reduced_data)

# TODO: Find the cluster centers
centers = clusterer.cluster_centers_

log_centers = pca.inverse_transform(centers)

Data:

log_data = np.log(data)

good_data = log_data.drop(log_data.index[outliers]).reset_index(drop = True)

pca = PCA(n_components=2)
pca = pca.fit(good_data)

reduced_data = pca.transform(good_data)

reduced_data = pd.DataFrame(reduced_data, columns = ['Dimension 1', 'Dimension 2'])

The data is a CSV; the header and first few rows look like:

   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564

Answer 1

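The problem is that pca.inverse_transform() should not receive the cluster centers as its argument.

In fact, if you look at the documentation, it should be given the data obtained by applying PCA to the original data, and not the centroids obtained with KMeans.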

ValueError: shapes (2,2) and (4,6) not aligned: 2 (dim 1) != 4 (dim 0)
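
The shapes in the error message hint at where the mismatch comes from: (2,2) matches centers (2 clusters, 2 dimensions), and (4,6) would match the components_ of a PCA fitted with 4 components on 6 features. Below is a minimal sketch of that reading. It assumes the pca object in scope at the failing line had 4 components, which the question does not show. The random data stands in for the CSV, and pca_2 and pca_4 are hypothetical names used only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic stand-in for good_data: 50 rows, 6 positive features (assumption,
# not the real CSV from the question).
rng = np.random.default_rng(0)
good_data = np.log(rng.uniform(1.0, 1000.0, size=(50, 6)))

pca_2 = PCA(n_components=2).fit(good_data)  # components_ has shape (2, 6)
pca_4 = PCA(n_components=4).fit(good_data)  # components_ has shape (4, 6)

reduced_data = pca_2.transform(good_data)   # shape (50, 2)
clusterer = KMeans(n_clusters=2, random_state=0, n_init=10).fit(reduced_data)
centers = clusterer.cluster_centers_        # shape (2, 2)

# Column count matches n_components: 2 columns in, 6 features out.
log_centers = pca_2.inverse_transform(centers)

# Column count does not match: a (2, 2) array against a 4-component PCA.
try:
    pca_4.inverse_transform(centers)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

In this sketch log_centers has shape (2, 6), and np.exp(log_centers) would undo the np.log step from the question. The number of columns passed to inverse_transform has to equal the n_components of the PCA object it is called on.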
