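I performed hierarchical clustering with hclust() on some text data, using stringdist to compute the distances. This gave me a dissimilarity matrix between the strings, which I named distancemodels.
Now I am trying to find the center of each cluster with the following code: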
dists = as.data.frame(distancemodels)
dists$ID = as.integer(rownames(dists))
# this adds the clusters information
dists = merge(dists,clusters[,c(1,4)])
#k = number of clusters
meds = as.vector(1:k)
# This for loop is throwing the following error:
# Error in colMeans(dists[dists$cluster == i, as.character(dists$ID[dists$cluster == : 'x' must be an array of at least two dimensions
for (i in 1:k) {
  meds[i] = as.integer(names(
    colMeans(dists[dists$cluster == i, as.character(dists$ID[dists$cluster == i])])[
      unname(which(
        colMeans(dists[dists$cluster == i, as.character(dists$ID[dists$cluster == i])]) ==
          min(colMeans(dists[dists$cluster == i, as.character(dists$ID[dists$cluster == i])]))
      ))
    ]
  ))[1]
}
medians = as.data.frame(unlist(t(t(meds))))
medians$cluster = rownames(medians)
I wrote this myself because I could not find any help online on how to find cluster centroids for hclust. Please let me know where I went wrong. I am new to R.
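
One likely cause of the error, stated as an assumption because the data is not shown: when a cluster has a single member, the subset selects a single column, R drops the result to a plain vector, and colMeans() then complains that 'x' does not have two dimensions. Passing drop = FALSE when subsetting keeps the two-dimensional shape. Since only a dissimilarity matrix is available, the "center" computed below is a medoid (the member with the smallest mean distance to the rest of its cluster), which appears to be what the loop above is aiming for. This is a minimal sketch, not a definitive fix: the toy strings, the helper name find_medoid, the average linkage, and k = 3 are all hypothetical, and base R's adist() stands in for the stringdist call so that the example runs without extra packages.

```r
# Hypothetical toy data; the original strings are not shown in the post.
strings <- c("apple", "appel", "aple", "banana", "bananna", "zebra")

# adist() is base R (generalized Levenshtein distance). It stands in here for
# the stringdist call that produced `distancemodels` in the post.
d <- adist(strings)
dimnames(d) <- list(strings, strings)

# Cluster and cut the tree into k groups (k and the linkage are assumptions).
hc <- hclust(as.dist(d), method = "average")
k <- 3
cl <- cutree(hc, k = k)  # named integer vector: string -> cluster id

# Medoid of one cluster: the member with the smallest mean distance to the
# members of the same cluster. drop = FALSE keeps a 1 x 1 matrix for
# single-member clusters, so colMeans() still receives two dimensions.
find_medoid <- function(members, d) {
  sub <- d[members, members, drop = FALSE]
  members[which.min(colMeans(sub))]
}

# One medoid per cluster, named by cluster id.
medoids <- vapply(split(names(cl), cl), find_medoid, character(1), d = d)
print(medoids)
```

Working on the full matrix indexed by name avoids the merge step and the repeated subsetting; ties are broken by taking the first member, as the [1] in the loop above does.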