Question

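I have a 206×23 matrix called "na.college" and a 1×23 matrix called "Tufts". Their column names are the same because they come from the same data set.

From the cluster analysis below, I need 6 clusters.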

# Now I split the data into two sets, private and public.
sub_Pr<-subset(na.college, subset = Public..1...Private..2.   %in% 2)
plot(hc1<-hclust(dist(sub_Pr[,c(-1,-2,-3)]),method="complete"),hang=-1, main="Private")


groups <- cutree(hc1, k = 6) # cut tree into 6 clusters
# draw dendrogram with red borders around the 6 clusters
rect.hclust(hc1, k = 6, border="red")

# the rows of mat have to match the clustering results, so start from
# sub_Pr (the data that was clustered), not from the full na.college;
# also drop any columns that you dropped for clustering
mat <- sub_Pr

Pr_1 <- mat[which(groups==1),]
Pr_2 <- mat[which(groups==2),]
Pr_3 <- mat[which(groups==3),]
Pr_4 <- mat[which(groups==4),]
Pr_5 <- mat[which(groups==5),]
Pr_6 <- mat[which(groups==6),]

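These 6 clusters also have 23 columns. I have the 1×23 data called "Tufts". What I want to do now is calculate the Euclidean distance between each cluster and "Tufts".

If you have an answer, please let me know.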

Answer 1

Using a loop:

distances <- data.frame(numeric(), numeric())
for (i in 1:6) {
  centroid  <- apply(mat[which(groups==i),],2,mean)
  d         <- dist(rbind(centroid,Tufts))
  distances <- rbind(distances,c(i,d))
}
colnames(distances) <- c("group","distance")
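
The loop measures the distance from Tufts to each cluster's centroid (the vector of column means), which is one reasonable reading of "distance between a cluster and Tufts". Here is a minimal, self-contained sketch of the same idea on made-up data. The synthetic `mat`, `Tufts` and `groups` below are assumptions standing in for the real objects, and the distance is written out as the explicit Euclidean formula so it can be compared with `dist()`:

```r
# Synthetic stand-ins for the question's objects (assumed, not the real data)
set.seed(1)
mat    <- matrix(rnorm(60 * 4), nrow = 60, ncol = 4)  # 60 rows, 4 numeric columns
Tufts  <- matrix(rnorm(4), nrow = 1)                  # one row, same columns
groups <- cutree(hclust(dist(mat), method = "complete"), k = 6)

# Preallocate one row per cluster instead of growing the data frame
distances <- data.frame(group = 1:6, distance = NA_real_)
for (i in 1:6) {
  # drop = FALSE keeps a matrix even if a cluster has a single member
  centroid <- colMeans(mat[groups == i, , drop = FALSE])
  # Euclidean distance: square root of the summed squared differences
  distances$distance[i] <- sqrt(sum((centroid - Tufts[1, ])^2))
}
distances
```

For any cluster `i`, `dist(rbind(centroid, Tufts))` should give the same number as the explicit formula, since `dist()` uses the Euclidean metric by default.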

This does the same thing and is more in keeping with the R way of programming, but it is hopelessly obscure.

f <- function(i) 
       return(c(i,dist(rbind(Tufts,apply(mat[which(groups==i),],2,mean)))))
do.call("rbind",lapply(1:6,f))
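
If the `lapply` version feels too cryptic, the same centroid distances can probably be written without any explicit loop. The following is only a sketch on made-up data: `mat`, `Tufts` and `groups` are synthetic assumptions, and it relies on all columns of `mat` being numeric.

```r
# Synthetic stand-ins for the question's objects (assumed, not the real data)
set.seed(1)
mat    <- matrix(rnorm(60 * 4), nrow = 60, ncol = 4)
Tufts  <- matrix(rnorm(4), nrow = 1)
groups <- cutree(hclust(dist(mat), method = "complete"), k = 6)

# Column sums per cluster divided by cluster sizes: one centroid per row
centroids <- rowsum(mat, groups) / as.vector(table(groups))

# Subtract Tufts from every centroid (column by column),
# then take the Euclidean norm of each row
d <- sqrt(rowSums(sweep(centroids, 2, as.numeric(Tufts))^2))

result <- data.frame(group = as.integer(rownames(centroids)), distance = d)
result
```

Both `rowsum()` and `table()` order their output by group label, so the row-wise division lines up cluster sums with cluster sizes.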
