How to fix the kmeans error "more cluster centers than distinct data points" in R

Time: 2013-06-13 21:05:58

Tags: r k-means hierarchical-clustering

When I run the kmeans algorithm, I get this error:

Error in kmeans(x, 2, 15) : 
  more cluster centers than distinct data points.

What does this error mean, and how do I fix it? I thought my data points were distinct.

Here are my files and the code I use to run kmeans:

rnames.csv : 
"a1","a2","a3"

cells.csv : 
0,1,2,1,4,3,5,3,4

cnames.csv : 
"google","so","test"

cells = c(read.csv("c:\\data-files\\kmeans\\cells.csv", header = TRUE))
rnames = c(read.csv("c:\\data-files\\kmeans\\rnames.csv", header = TRUE))
cnames = c(read.csv("c:\\data-files\\kmeans\\cnames.csv", header = TRUE))

x <- matrix(cells, nrow=3, ncol=3, byrow=TRUE, dimnames=list(rnames, cnames))

# run K-Means
km <- kmeans(x, 2, 15)

1 Answer:

Answer 0 (score: 2)

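Each of your CSV files contains a single line of values and no header row. With header = TRUE, read.csv treats that line as column names, so no data rows are left for kmeans to cluster. The fix is to use: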

cells = c(read.csv("c:\\data-files\\kmeans\\cells.csv", header = FALSE))
rnames = c(read.csv("c:\\data-files\\kmeans\\rnames.csv", header = FALSE))
cnames = c(read.csv("c:\\data-files\\kmeans\\cnames.csv", header = FALSE))

instead of

cells = c(read.csv("c:\\data-files\\kmeans\\cells.csv", header = TRUE))
rnames = c(read.csv("c:\\data-files\\kmeans\\rnames.csv", header = TRUE))
cnames = c(read.csv("c:\\data-files\\kmeans\\cnames.csv", header = TRUE))
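
To see the difference for yourself, here is a minimal sketch that reads the same single line both ways and then runs kmeans. It uses the text argument of read.csv with in-memory strings as stand-ins for the files, and the names cells_text, rnames_text, cnames_text, with_header and no_header are made up for the example. Flattening with unlist is my own variation rather than part of the fix above, and it assumes you want an ordinary numeric 3x3 matrix.

```r
# In-memory stand-ins for the one-line CSV files from the question
# (hypothetical names; the real code reads the files from disk).
cells_text  <- "0,1,2,1,4,3,5,3,4"
rnames_text <- '"a1","a2","a3"'
cnames_text <- '"google","so","test"'

# header = TRUE treats the only line as column names, leaving no data rows.
with_header <- read.csv(text = cells_text, header = TRUE)
# header = FALSE keeps the line as a single row of data.
no_header <- read.csv(text = cells_text, header = FALSE)

print(dim(with_header))  # rows, columns when the line is used as a header
print(dim(no_header))    # rows, columns when the line is kept as data

# Assumption: flatten each one-row data frame into a plain vector so that
# matrix() builds an ordinary numeric matrix instead of a list matrix.
cells  <- unlist(no_header, use.names = FALSE)
rnames <- unlist(read.csv(text = rnames_text, header = FALSE,
                          stringsAsFactors = FALSE), use.names = FALSE)
cnames <- unlist(read.csv(text = cnames_text, header = FALSE,
                          stringsAsFactors = FALSE), use.names = FALSE)

x <- matrix(cells, nrow = 3, ncol = 3, byrow = TRUE,
            dimnames = list(rnames, cnames))

# Same call as in the question, with the arguments named:
# 2 centers and at most 15 iterations.
set.seed(1)  # starting centers are chosen at random
km <- kmeans(x, centers = 2, iter.max = 15)
print(km$cluster)
```

Checking dim() or str() on what read.csv returns before building the matrix is usually the quickest way to catch this kind of problem.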