Question

Suppose we have the following data set:

set.seed(144) 
dat <- matrix(rnorm(100), ncol=5)

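The following call builds every possible combination of columns and drops the first row (the one that selects no columns):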

(combinations <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(dat)))[-1,])
#     Var1  Var2  Var3  Var4  Var5
# 2   TRUE FALSE FALSE FALSE FALSE
# 3  FALSE  TRUE FALSE FALSE FALSE
# 4   TRUE  TRUE FALSE FALSE FALSE
# ...
# 31 FALSE  TRUE  TRUE  TRUE  TRUE
# 32  TRUE  TRUE  TRUE  TRUE  TRUE
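
To make the encoding concrete, each row of combinations can be read as a logical mask over the columns of dat. A minimal sketch, assuming the same seed and data as above; the names mask and sub are only illustrative:

```r
set.seed(144)
dat <- matrix(rnorm(100), ncol = 5)
combinations <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(dat)))[-1, ]

# Third remaining row (labelled "4" above): Var1 and Var2 are TRUE.
mask <- unlist(combinations[3, ])

# drop = FALSE keeps the result a matrix even when only one column is selected.
sub <- dat[, mask, drop = FALSE]
dim(sub)  # 20 rows, 2 selected columns
```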

The final step is to run k-means clustering on each column subset, which is a straightforward use of apply (we want 3 clusters in each kmeans model):

models <- apply(combinations, 1, function(x) kmeans(dat[,x], 3))

My question is how to run hierarchical clustering, instead of kmeans, on each column subset. Any ideas?

Answer 1

You can use hclust on the distance matrix of each subset, and cutree to extract a 3-cluster assignment:

models <- apply(combinations, 1, function(x) hclust(dist(dat[,x])))
clusters <- apply(combinations, 1, function(x) cutree(hclust(dist(dat[,x])), k = 3))
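
As a variation on the two lines above, the trees can be fitted once and then cut, rather than refitting inside the second apply. This is only a sketch under the question's setup; fit_subset is a hypothetical helper name, and drop = FALSE is added so that single-column subsets stay matrices:

```r
set.seed(144)
dat <- matrix(rnorm(100), ncol = 5)
combinations <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(dat)))[-1, ]

# Hypothetical helper: fit one hierarchical model on the selected columns.
fit_subset <- function(x) hclust(dist(dat[, x, drop = FALSE]))

# One hclust object per column subset (a list of length 31).
models <- apply(combinations, 1, fit_subset)

# Cut each stored tree into 3 groups instead of refitting it.
clusters <- sapply(models, cutree, k = 3)

dim(clusters)         # rows are observations, columns are column subsets
table(clusters[, 1])  # cluster sizes for the first subset
```

Here dist uses Euclidean distance and hclust uses complete linkage, since those are the defaults; pass method = to either call if a different distance or linkage is wanted.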
