With the code below I get the clusters plotted on Sepal.Length and Sepal.Width, but I also want to know which data points belong to which cluster. How can I do that?
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))
K-means clustering with 3 clusters of sizes 38, 50, 62
Cluster means:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     6.850000    3.073684     5.742105    2.071053
2     5.006000    3.428000     1.462000    0.246000
3     5.901613    2.748387     4.393548    1.433871
Clustering vector:
[1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 1 3 3 3 3 3
[59] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 3
[88] 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 1 1 1 1 3 1 1 1 1 1 1 3 3 1
[117] 1 1 1 3 1 3 1 3 1 1 3 3 1 1 1 1 1 3 1 1 1 1 3 1 1 1 3 1 1
[146] 1 3 1 1 3
Within cluster sum of squares by cluster:
[1] 23.87947 15.15100 39.82097
Available components:
[1] "cluster" "centers" "withinss" "size"
> table(iris$Species, kc$cluster)
              1  2  3
  setosa      0 50  0
  versicolor  2  0 48
  virginica  36  0 14
> plot(newiris[c("Sepal.Length", "Sepal.Width")], col=kc$cluster)
> points(kc$centers[,c("Sepal.Length", "Sepal.Width")], col=1:3, pch=8, cex=2)
Answer 0 (score: 2)
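You have already shown us the answer. kc$cluster holds the cluster that each observation was assigned to. It is printed by default as the "Clustering vector", and you can run str(kc) to see everything the kmeans function returns.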
str(kc)
## List of 9
## $ cluster : int [1:150] 1 3 3 3 1 1 1 1 3 3 ...
## $ centers : num [1:3, 1:4] 5.18 6.31 4.74 3.62 2.9 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:3] "1" "2" "3"
## .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
## $ totss : num 681
## $ withinss : num [1:3] 6.43 118.65 17.67
## $ tot.withinss: num 143
## $ betweenss : num 539
## $ size : int [1:3] 33 96 21
## $ iter : int 2
## $ ifault : int 0
## - attr(*, "class")= chr "kmeans"
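To list the observations in each cluster directly, one option is to split the row indices by kc$cluster. The following is only a sketch built on the question's setup. The set.seed(42) call and the name members are my own additions: kmeans starts from randomly chosen centers, so cluster labels and sizes can change between runs unless the seed is fixed, which is probably why the str(kc) output above does not match the sizes printed in the question.

```r
# Sketch: find which rows fall in each cluster.
# set.seed(42) is an arbitrary choice that makes the run repeatable.
set.seed(42)
newiris <- iris
newiris$Species <- NULL
kc <- kmeans(newiris, 3)

# Row indices of the observations assigned to cluster 1
which(kc$cluster == 1)

# All clusters at once: a list with one vector of row indices per cluster
# ("members" is a hypothetical name used only for this example)
members <- split(seq_len(nrow(newiris)), kc$cluster)
str(members)
```

members[["2"]] then gives the row numbers in cluster 2, and iris[members[["2"]], ] returns the corresponding rows of the original data.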
Answer 1 (score: 1)
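As Thomas and Mamoun said, the cluster information is in kc$cluster, in the same order as the original observations. It can be added back to the original data set as follows: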
newiris <- cbind(newiris, cluster = kc$cluster)
head(newiris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width cluster
1          5.1         3.5          1.4         0.2       1
2          4.9         3.0          1.4         0.2       1
3          4.7         3.2          1.3         0.2       1
4          4.6         3.1          1.5         0.2       1
5          5.0         3.6          1.4         0.2       1
6          5.4         3.9          1.7         0.4       1
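Once the cluster column is attached, ordinary data frame operations can pull out or summarise each group. A small sketch, assuming the same setup as the question; the seed value and the names clustered and cluster2 are illustrative choices of mine, and a new data frame is used so that re-running the code does not add a second cluster column to newiris.

```r
# Sketch: work with the cluster labels as a regular column.
set.seed(42)                       # arbitrary seed, for repeatability only
newiris <- iris
newiris$Species <- NULL
kc <- kmeans(newiris, 3)

# Attach the labels to a copy of the data
clustered <- cbind(newiris, cluster = kc$cluster)

# All rows assigned to cluster 2
cluster2 <- clustered[clustered$cluster == 2, ]
head(cluster2)

# Number of observations per cluster (same counts as kc$size)
table(clustered$cluster)

# Per-cluster mean of every measurement; on a converged run these
# should agree with kc$centers
aggregate(. ~ cluster, data = clustered, FUN = mean)
```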