Question

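I am currently exploring the kmeans function. I have a simple text file (test.txt) containing the following entries. The data can be split into 2 clusters.

How can I plot the result of the kmeans function (using the plot function) together with the original data? I would also like to see how the clusters are distributed along with their centroids.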

Answer 1

Here is an example taken from example(kmeans):

# This is just to generate example data
test <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(test) <- c("V1", "V2")

# Store the k-means result in a variable called cl (the parentheses also print it)
(cl <- kmeans(test, 2))

# Plot the data colored by cluster, then overlay the centroids
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

Edit

The OP had some follow-up questions:

(cl <- kmeans(test, 2))
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

The code above produces the plot for case 1.

(cl <- kmeans(test[,1], 2))
plot(test[,1], col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

The code above produces the plot for case 2.

(cl <- kmeans(test[,1], 2))
plot(cbind(0,test[,1]), col = cl$cluster)
points(cbind(0,cl$centers), col = 1:2, pch = 8, cex = 2)

The code above produces the plot for case 3.

Explanation

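In case 1 the data has two dimensions (V1, V2), so each centroid has two coordinates in the plot. In case 2 the data is one-dimensional (V1), just like your data. R assigns each point an index and uses it as the x value, with the data itself on the y axis. The centroids also have only one coordinate each, so they are drawn at index 1 and index 2, which is why you keep seeing them at the far left of the plot. Case 3 shows what one-dimensional data actually looks like when you plot it along a single dimension.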
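The index behaviour in case 2 can be checked directly. The following is a minimal sketch, assuming simulated one-dimensional data (the vector x and the seed are made up for illustration, not the OP's test.txt). It uses xy.coords, the helper that plot and points rely on to turn their input into x/y pairs, to show where the one-column matrix of centroids ends up.

```r
# Simulated one-dimensional data (illustrative only, not the OP's file)
set.seed(42)
x <- c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))

cl <- kmeans(x, 2)

# cl$centers is a 2 x 1 matrix; with a single column, the values become
# the y coordinates and the row index becomes the x coordinate
xy <- xy.coords(cl$centers)
xy$x  # 1 2
xy$y  # the two centroid values
```

Because the data points are spread over indices 1 to 100 while the centroids sit at indices 1 and 2, the centroid markers appear squeezed against the left edge.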

Conclusion

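Your data is one-dimensional. If you plot it in two dimensions you get something like case 2, where the x values are supplied by R and are simply the index values. Plotting it that way does not make much sense.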
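If a picture of one-dimensional clusters is still wanted, one possible alternative (not part of the original answer) is a strip chart with one row per cluster and the centroids drawn as vertical lines. This is only a sketch under the assumption of simulated data; x and the seed are hypothetical stand-ins for the values in test.txt.

```r
# Simulated one-dimensional data (illustrative only, not the OP's file)
set.seed(1)
x <- c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))

cl <- kmeans(x, 2)

# One row of jittered points per cluster, along the single data dimension
groups <- split(x, cl$cluster)
stripchart(groups, method = "jitter", pch = 1, col = 1:2,
           xlab = "V1", ylab = "cluster")

# Mark each centroid with a dashed vertical line in its cluster's color
abline(v = cl$centers, col = 1:2, lty = 2)
```

Here the horizontal axis carries the actual data values, so the position of each centroid relative to its cluster can be read off directly.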
