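I have a 206×23 matrix called "na.college" and a 1×23 matrix called "Tufts". Their column names are identical because they come from the same data set.
Based on the cluster analysis below, I need 6 clusters.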
# Split the data into private and public schools; this is the private subset.
sub_Pr <- subset(na.college, subset = Public..1...Private..2. %in% 2)
plot(hc1 <- hclust(dist(sub_Pr[, c(-1, -2, -3)]), method = "complete"), hang = -1, main = "Private")
groups <- cutree(hc1, k = 6) # cut the tree into 6 clusters
# draw the dendrogram with red borders around the 6 clusters
rect.hclust(hc1, k = 6, border = "red")
# the rows of mat have to match the clustering results:
# groups has one entry per row of sub_Pr, so index that same subset
mat <- sub_Pr
Pr_1 <- mat[which(groups==1),]
Pr_2 <- mat[which(groups==2),]
Pr_3 <- mat[which(groups==3),]
Pr_4 <- mat[which(groups==4),]
Pr_5 <- mat[which(groups==5),]
Pr_6 <- mat[which(groups==6),]
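Each of these 6 clusters also has 23 columns, and I have the 1×23 data called "Tufts". What I want to do now is compute the Euclidean distance between each cluster and "Tufts".
If you have an answer, please let me know.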
Answer 0 (score: 0)
Using a loop:
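# collect one (group, distance) row per cluster
distances <- data.frame(numeric(), numeric())
for (i in 1:6) {
  # centroid of cluster i: the mean of each column over its rows
  centroid <- apply(mat[which(groups == i), ], 2, mean)
  # Euclidean distance between that centroid and Tufts
  d <- dist(rbind(centroid, Tufts))
  distances <- rbind(distances, c(i, d))
}
colnames(distances) <- c("group", "distance")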
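As a quick check on what the loop measures: with its default settings, `dist()` applied to a two-row matrix returns the ordinary Euclidean distance between the two rows, so `d` is the distance from the cluster centroid to Tufts. A minimal sketch with made-up numbers (`centroid_demo` and `ref_demo` are hypothetical vectors, not values from the college data):

```r
# Hypothetical 3-column centroid and reference row, for illustration only
centroid_demo <- c(1, 2, 3)
ref_demo <- c(4, 6, 3)

# Distance as the loop computes it: dist() on the two stacked rows
d_dist <- as.numeric(dist(rbind(centroid_demo, ref_demo)))

# The same distance written out by hand: sqrt of the summed squared differences
d_manual <- sqrt(sum((centroid_demo - ref_demo)^2))

all.equal(d_dist, d_manual)
```

Both expressions give 5 here, since the squared differences are 9, 16 and 0.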
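The following does the same thing as the loop and is more in keeping with the R programming philosophy, but it is hopelessly esoteric.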
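# f(i) returns c(group, distance from the centroid of cluster i to Tufts)
f <- function(i)
  return(c(i, dist(rbind(Tufts, apply(mat[which(groups == i), ], 2, mean)))))
# stack the six results into a 6 x 2 matrix
do.call("rbind", lapply(1:6, f))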
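Since `na.college` and `Tufts` are not available here, the same idea can be tried end to end on synthetic data. This is only a sketch under stated assumptions: `demo_mat`, `demo_ref` and the other `demo_` names are hypothetical stand-ins, all columns are assumed numeric, and `colMeans()` with `sweep()` is swapped in for the `apply()` and `dist()` calls above.

```r
set.seed(1)

# Synthetic stand-in for the numeric columns of the clustered data (40 rows, 5 columns)
demo_mat <- matrix(rnorm(40 * 5), nrow = 40,
                   dimnames = list(NULL, paste0("v", 1:5)))
# Synthetic 1 x 5 reference row playing the role of Tufts
demo_ref <- matrix(rnorm(5), nrow = 1,
                   dimnames = list("ref", colnames(demo_mat)))

# Same clustering steps as in the question, cut into 6 clusters
demo_hc <- hclust(dist(demo_mat), method = "complete")
demo_groups <- cutree(demo_hc, k = 6)

# One centroid per cluster: a 6 x 5 matrix of column means
centroids <- t(sapply(1:6, function(i) {
  colMeans(demo_mat[demo_groups == i, , drop = FALSE])
}))

# Euclidean distance from each centroid to the reference row
distance <- sqrt(rowSums(sweep(centroids, 2, as.numeric(demo_ref))^2))
demo_distances <- data.frame(group = 1:6, distance = distance)
demo_distances
```

The cluster whose centroid is closest to the reference row can then be read off with `demo_distances[which.min(demo_distances$distance), ]`. Using `drop = FALSE` keeps a one-row cluster as a matrix, so `colMeans()` still works when a cluster has a single member.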