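I ran K-means clustering in scikit-learn and then plotted the cluster regions following the scikit-learn example. Next, I clustered again within each cluster, and I want to show the boundaries of the sub-clusters on the same plot. I found this question interesting, but when I applied that approach the axis limits changed and a new plot appeared.
Edited: my function is as follows: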
def plot_pca_clusters_races_match(pca_km, reduced_data, pca_data_winner,
                                  race1_pca_km, race1_reduced_data, race1_pca_data_winner, race1_nclusters,
                                  race2_pca_km, race2_reduced_data, race2_pca_data_winner, race2_nclusters,
                                  plt_opt, fig_path, race_approach, n_clusters):
    """
    :param pca_km: K-means trained by PCA data (2 components)
    :param reduced_data: PCA components
    :param data_winner: player_id, pca_component1, pca_component2, race_id, winner
    :param plt_opt: space required to plot cluster area
    :param fig_path: path to save the plot
    :param race_approach:
    :param n_clusters:
    :return:
    """
    race_id_list = ['Z', 'T', 'P']
    # 1- Plot cluster area
    x_min, x_max = reduced_data[:, 0].min() + plt_opt[0], reduced_data[:, 0].max() + plt_opt[1]
    y_min, y_max = reduced_data[:, 1].min() + plt_opt[2], reduced_data[:, 1].max() + plt_opt[3]
    step = abs((abs(x_max) - abs(x_min))) / 100
    xx, yy = np.meshgrid(np.arange(x_min, x_max, step), np.arange(y_min, y_max, step))
    Z = pca_km.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.figure(1)
    plt.clf()
    # Plot cluster regions
    plt.imshow(Z, interpolation='nearest',
               extent=(xx.min(), xx.max(), yy.min(), yy.max()),
               cmap=plt.cm.Paired,
               aspect='auto', origin='lower')
    # 2- Plot cluster members
    race_ids = list(set(pca_data_winner[:, -3]))
    # Find race type
    reduced_data_race1 = pca_data_winner[np.where(pca_data_winner[:, -3] == race_ids[0]), :][0]
    # Plot race 1
    plt.plot(reduced_data_race1[:, 2], reduced_data_race1[:, 3], 'k.', markersize=4, color='red',
             label=race_id_list[int(race_ids[0])])
    # Plot race 2
    # If the race is non-symmetric, change color of the cluster members
    if len(race_ids) > 1:
        reduced_data_race2 = pca_data_winner[np.where(pca_data_winner[:, -3] == race_ids[1]), :][0]
        plt.plot(reduced_data_race2[:, 2], reduced_data_race2[:, 3], 'k.', markersize=4, color='green',
                 label=race_id_list[int(race_ids[1])], hold=True)
    # 3-Plot cluster centers
    markers = ['d', 'v', 's', '*', 'h', 'p', 'o']
    for cluster in range(0, pca_km.cluster_centers_.shape[0]):
        plt.scatter(pca_km.cluster_centers_[cluster, 0], pca_km.cluster_centers_[cluster, 1],
                    marker=markers[cluster], s=80, linewidths=1,
                    label='Cluster ' + str(cluster),
                    color='b', zorder=4, hold=True)
    plt.xlabel('PC 1')
    plt.ylabel('PC 2')
    plt.legend(prop={'size':8})
    # --------------------------------------------- Plot boundaries of sub-clusters
    x1_min, x1_max = race1_reduced_data[:, 0].min() + plt_opt[0], race1_reduced_data[:, 0].max() + plt_opt[1]
    y1_min, y1_max = race1_reduced_data[:, 1].min() + plt_opt[2], race1_reduced_data[:, 1].max() + plt_opt[3]
    step = abs((abs(x_max) - abs(x_min))) / 100
    xx1, yy1 = np.meshgrid(np.arange(x1_min, x1_max, step), np.arange(y1_min, y1_max, step))
    Z1 = race1_pca_km.predict(np.c_[xx1.ravel(), yy1.ravel()])
    Z1 = Z1.reshape(xx1.shape)
    # Plot sub-cluster boundaries
    plt.contour(Z, extent=(xx.min(), xx.max(), yy.min(), yy.max()))
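Answer 0 (score: 0)
The first plot, the one without the contour, ends up in the lower-left corner of the second plot. This happens because the contour was not given a proper extent, in which case it simply spans the row and column indices of the Z array.
You need to give the contour its extent: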
plt.contour(Z, extent=(..,..,..,..))
Or specify X and Y arrays that determine the coordinates:
plt.contour(X,Y,Z)
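The second form can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the asker's actual pipeline: the cluster centers and grid bounds are made up, and the hypothetical helper nearest_center_labels stands in for KMeans.predict so that the sketch needs only NumPy and Matplotlib. Both label grids are evaluated on the same meshgrid, and the sub-cluster grid is passed to contour together with xx and yy so the boundaries are drawn in data coordinates on the existing axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np


def nearest_center_labels(points, centers):
    """Hypothetical stand-in for KMeans.predict: index of the closest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


# Made-up centers for the top-level clusters and for the sub-clusters.
centers = np.array([[-2.0, -1.0], [2.0, 0.0], [0.0, 3.0]])
sub_centers = np.array([[-3.0, -2.0], [-1.0, 0.0], [1.0, -1.0], [2.5, 1.5]])

# One grid, in data coordinates, shared by both layers.
x_min, x_max, y_min, y_max = -5.0, 5.0, -4.0, 6.0
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                     np.linspace(y_min, y_max, 200))
grid = np.c_[xx.ravel(), yy.ravel()]

Z = nearest_center_labels(grid, centers).reshape(xx.shape)       # cluster regions
Z1 = nearest_center_labels(grid, sub_centers).reshape(xx.shape)  # sub-cluster regions

fig, ax = plt.subplots()
ax.imshow(Z, interpolation="nearest", extent=(x_min, x_max, y_min, y_max),
          cmap=plt.cm.Paired, aspect="auto", origin="lower")
limits_before = (ax.get_xlim(), ax.get_ylim())

# Passing xx and yy places the contour in data coordinates rather than
# array indices. Half-integer levels fall between the integer labels,
# so the lines trace the borders between sub-clusters.
levels = np.arange(len(sub_centers) - 1) + 0.5
ax.contour(xx, yy, Z1, levels=levels, colors="k", linewidths=1)
limits_after = (ax.get_xlim(), ax.get_ylim())
```

Drawing on a single Axes object (ax) instead of through plt avoids accidentally creating a second figure, and comparing limits_before with limits_after is a quick way to confirm that the overlay did not rescale the plot. A call such as fig.savefig(...) or plt.show() can follow to view the result.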