Question

I am using a Jupyter notebook; here is the kernel information:

Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 2 2016, 17:53:06) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]

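I am using k-means clustering. When I plot the clusters, the only color used is blue. That is not a big problem with the current setup, but I need to scale this up, so the clusters need different colors. I am following a tutorial, so I do not understand 100% of the code. The code is below.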

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
style.use("ggplot")
from sklearn.cluster import KMeans

x = [1,5,1.5,8,1,9]
y = [2,8,1.8,8,.6,11]

plt.scatter(x,y)
plt.show()

X = np.array([[1,2],[5,8],[1.5,1.8],[8,8],[1,.6],[9,11]])

kmeans = KMeans(n_clusters=2)
kmeans.fit(X)

centroids = kmeans.cluster_centers_
labels = kmeans.labels_

print(centroids)
print(labels)

colors = ['r','b','y','g','c','m']

for i in range(len(X)):
    print("coordinate:",X[i], "label:", labels[i])
    plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize = 10)

plt.scatter(centroids[:, 0],centroids[:, 1], marker = "x", s=150, linewidths = 5, zorder = 10)

plt.show()

plt.scatter(x,y)
plt.scatter(centroids[:, 0],centroids[:, 1], marker = "x", s=150, linewidths = 5, zorder = 10)

plt.show()

I think my problem lies in this part:

colors = ['r','b','y','g','c','m']

for i in range(len(X)):
    print("coordinate:",X[i], "label:", labels[i])
    plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize = 10)

Answer 1

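I was indeed wrong; my previous solution was incorrect. I finally got a proper look at what labels and centroids return, and I think this is what you are asking for.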

You can pass a sequence as the color= argument, so there is no need for a for loop:

colors = ['r','b','y','g','c','m']
# one color per point, looked up by its cluster label
plt.scatter(x, y, color=[colors[l_] for l_ in labels])
# centroid k gets the same color as the points of cluster k
plt.scatter(centroids[:, 0], centroids[:, 1], color=colors[:len(centroids)], marker = "x", s=150, linewidths = 5, zorder = 10)
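
For reference, here is a minimal self-contained sketch of the same idea using the data from the question. The labels are hard-coded as an assumption of what KMeans might return for n_clusters=2 (the cluster numbering can differ between runs), the centroids are computed as the mean of each cluster's points, and the helper name colors_for is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Data from the question.
X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])

# Assumed stand-in for kmeans.labels_; real cluster numbering may differ.
labels = np.array([0, 1, 0, 1, 0, 1])

# Stand-in for kmeans.cluster_centers_: the mean of each cluster's points.
centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

colors = ['r', 'b', 'y', 'g', 'c', 'm']


def colors_for(labels, palette):
    """Return one color per point by looking up its cluster label."""
    return [palette[label] for label in labels]


point_colors = colors_for(labels, colors)

fig, ax = plt.subplots()
# Points: one color per point, chosen by cluster label.
ax.scatter(X[:, 0], X[:, 1], color=point_colors)
# Centroids: centroid k reuses the color of cluster k.
ax.scatter(centroids[:, 0], centroids[:, 1], color=colors[:len(centroids)],
           marker="x", s=150, linewidths=5, zorder=10)
```

In a notebook the figure is displayed when the cell finishes; in a script, call plt.show() afterwards.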

Answer 2

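With K-means you want each cluster to have a different color. If you have 2 clusters, your kmeans model stores its labels in kmeans.labels_ as an array such as [1 1 1 1 0 0 1 0 0 0 1 0 0...]. To use specific colors, iterate over this array before any of the plotting code and build a list that sets the color of each point: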

colors = []
for i in kmeans.labels_:
  if i == 0:
    colors.append('blue')
  elif i == 1:
    colors.append('orange')

If you want to use a predefined Seaborn palette for your colors, you can index into the palette as well. For example, to use the "deep" palette:

import seaborn as sns

palette = sns.color_palette('deep')
colors = []
for i in kmeans.labels_:
  if i == 0:
    colors.append(palette[0])
  elif i == 1:
    colors.append(palette[1])

If you have 3 clusters, you need to add another elif for i == 2, and so on.
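
The elif chain grows with the number of clusters. One possible alternative, sketched here with a hypothetical labels list and color list rather than real model output, is to index a list of colors by label:

```python
# Hypothetical labels, standing in for kmeans.labels_ with 3 clusters.
labels = [1, 1, 0, 2, 0, 2, 1]

# One entry per cluster; position k holds the color for label k.
cluster_colors = ['blue', 'orange', 'green']

# Look up each point's color by its label; no if/elif chain is needed.
colors = [cluster_colors[label] for label in labels]
print(colors)
```

The same lookup should work with a Seaborn palette in place of cluster_colors, since sns.color_palette returns a list of RGB tuples.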

Then, when you create the plot, just set the c argument to the colors list you built:

plt.scatter(df['x'], df['y'], c = colors)
plt.show()
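
As a self-contained illustration of that last step, the sketch below uses the question's data as plain lists in place of df['x'] and df['y']. The labels are an assumed stand-in for kmeans.labels_, not real model output.

```python
import matplotlib.pyplot as plt

# Data from the question; plain lists stand in for df['x'] and df['y'].
x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]

# Assumed cluster labels (stand-in for kmeans.labels_).
labels = [0, 1, 0, 1, 0, 1]

# Build one color per point, matching the blue/orange loop above.
colors = ['blue' if label == 0 else 'orange' for label in labels]

fig, ax = plt.subplots()
# c accepts a sequence with one color per point.
ax.scatter(x, y, c=colors)
```

The colors list must have the same length as x and y, one entry per point.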

K-means clustering colors not changing
