Question

I can plot a vertical dendrogram with labels, but when I make it horizontal I cannot get the labels to show.

My data looks like this:

Company Industry1 Industry2 Industry3
Google     3%        5%        6%
Apple      2%        6%        1%

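When I import the data, the first column contains my labels, but the row names are just 1, 2, 3, and so on.

This is my code (the data source is named Cluster_D):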

labs = Cluster_D[, 1]
Industry <- Cluster_D
rownames(Industry) <- labs$`Company`


D.Industry <- dist(scale(round(Industry[, -1], 3)), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")
plot(H.Industry, labels = Cluster_D$`Company`)

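So I assign my labels to the variable `labs`, then put my data into another variable, `Industry`. Once I plot the data and pass in the labels, I get the cluster plot I need. That plot is vertical and shows the labels, but I don't know how to flip it to horizontal and keep the label names.

I tried the as.dendrogram function, which lets me use horiz = TRUE, but I lose my labels: they revert to 1, 2, 3, and so on.

Can anyone explain how I can fix this myself? I am used to Statistica, where hierarchical clustering gave me no trouble, and I am trying to pick up R. I feel that assigning labels should be very easy, but I just don't know how.

I tried the following, but the plot is labelled incorrectly (the labels appear in ABC order).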

F.Industries <- as.dendrogram(H.Industry)
labels(F.Industries) <- paste(as.character(Cluster_D[,1]))
plot(F.Industries, horiz = TRUE)

Answer 1

As PAR requested:

Data (I added one more row, IBM):

z <- read.table(text = "Company Industry1 Industry2 Industry3
Google     3%        5%        6%
Apple      2%        6%        1%
IBM        7%        4%        2%", header = TRUE)

When I try:

scale(round(z[, -1], 3))
#output
Error in Math.data.frame(list(Industry1 = c(2L, 1L, 3L), Industry2 = c(2L,  : 
  non-numeric variable in data frame: Industry1Industry2Industry3

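This shows that the sample data you posted is not representative of your actual data: the percent values are read in as non-numeric, so round() fails.

Convert to numeric: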

z = data.frame("Company" = z[,1], apply(z[,-1], 2, function(x) as.numeric(gsub("%", "", x))))
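As a cross-check on the conversion step, here is a minimal sketch of an alternative that strips the percent sign column by column and then confirms the result is numeric. The data frame `raw` and the helper name `pct_cols` are made up for illustration and only mirror the sample data above.

```r
# Hypothetical copy of the sample data, with percentages stored as text
raw <- data.frame(
  Company   = c("Google", "Apple", "IBM"),
  Industry1 = c("3%", "2%", "7%"),
  Industry2 = c("5%", "6%", "4%"),
  Industry3 = c("6%", "1%", "2%"),
  stringsAsFactors = FALSE
)

# Every column except the first holds percent strings
pct_cols <- names(raw)[-1]

# Drop the literal "%" and convert each column to numeric
raw[pct_cols] <- lapply(raw[pct_cols], function(x) {
  as.numeric(sub("%", "", x, fixed = TRUE))
})

# Inspect the column types before passing the data to scale() and dist()
str(raw)
```

Using lapply() on the selected columns keeps the result a data frame, so the Company column is left untouched.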

The row names are used as the leaf labels:

rownames(z) <- z[,1]

D.Industry <- dist(scale(z[, -1]), method = "euclidean")
H.Industry <- hclust(D.Industry, method = "ward.D")

plot(as.dendrogram(H.Industry), horiz = TRUE)
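The sketch below may help show why setting row names is enough, and why assigning labels in data order to a dendrogram can mislabel the leaves. It assumes the percentages have already been converted to plain numbers. The short names `H` and `dend` are illustrative only.

```r
# Hypothetical numeric version of the sample data
z <- data.frame(
  Company   = c("Google", "Apple", "IBM"),
  Industry1 = c(3, 2, 7),
  Industry2 = c(5, 6, 4),
  Industry3 = c(6, 1, 2)
)
rownames(z) <- z$Company

H <- hclust(dist(scale(z[, -1]), method = "euclidean"), method = "ward.D")
dend <- as.dendrogram(H)

# Leaf labels are taken from the row names, listed in plotting order
labels(dend)

# The same labels, obtained by reordering the hclust labels with H$order
H$labels[H$order]

plot(dend, horiz = TRUE)
```

Because the leaves are stored in plotting order rather than in the order of the rows, a label vector taken straight from the data would have to be reordered with `H$order` before being assigned to the dendrogram.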

The margins can be adjusted with mar:

par(mar=c(2, 0, 0, 8))
plot(as.dendrogram(H.Industry), horiz = TRUE)
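Since par() changes the settings for every later plot on the device, one option is to save the old settings and restore them afterwards. This is a minimal sketch on a hypothetical numeric copy of the sample data. The margin values are the ones used above, and a wider right margin may be needed for longer labels.

```r
# Hypothetical numeric version of the sample data
z <- data.frame(
  Industry1 = c(3, 2, 7),
  Industry2 = c(5, 6, 4),
  Industry3 = c(6, 1, 2),
  row.names = c("Google", "Apple", "IBM")
)
H <- hclust(dist(scale(z)), method = "ward.D")

# par() returns the previous values of the parameters it changes
old_par <- par(mar = c(2, 0, 0, 8))  # bottom, left, top, right (in lines)

# With horiz = TRUE the leaf labels are drawn on the right-hand side
plot(as.dendrogram(H), horiz = TRUE)

# Restore the margins that were in effect before
par(old_par)
```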

Other approaches include the ape and ggdendro packages.
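For reference, a rough sketch of what those two packages might look like in use, assuming they are installed. Each call is guarded with requireNamespace() so the code skips a package that is missing. The data is a hypothetical numeric copy of the sample above.

```r
# Hypothetical numeric version of the sample data
z <- data.frame(
  Industry1 = c(3, 2, 7),
  Industry2 = c(5, 6, 4),
  Industry3 = c(6, 1, 2),
  row.names = c("Google", "Apple", "IBM")
)
H <- hclust(dist(scale(z)), method = "ward.D")

# ape: convert the hclust object to a "phylo" tree; the default plot
# grows to the right, with the tip labels at the leaves
if (requireNamespace("ape", quietly = TRUE)) {
  plot(ape::as.phylo(H))
}

# ggdendro: ggplot2-based dendrogram; rotate = TRUE draws it horizontally
if (requireNamespace("ggdendro", quietly = TRUE)) {
  print(ggdendro::ggdendrogram(H, rotate = TRUE))
}
```

Both take the labels from the hclust object, so they again rely on the row names having been set before clustering.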
