Question

Suppose I have this dataset, data1 (after scaling and PCA).

            F1    F2     F3  F4   F5 ... F21
1          0.28  2.29  5.64 1.04 3.92    1065
2          0.26  1.28  4.38 1.05 3.40    1050
3          0.30  2.81  5.68 1.03 3.17    1185
4          0.24  2.18  7.80 0.86 3.45    1480
5          0.39  1.82  4.32 1.04 2.93     735
.
.
.
1000       0.34  1.97  6.75 1.05 2.85    1450

I ran a k-means cluster analysis on the dataset as follows:

library(dplyr) # provides %>% and mutate()
Clusters <- kmeans(data1, 5, nstart = 25)
data1 <- data.frame(data1)
data1 <- data1 %>% mutate(Cluster = Clusters$cluster)

Then I set character row names taken from another dataset:

rownames(data1) <- data2$Name

Then, to avoid overlapping labels, I used

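library(ggfortify) # autoplot() method for kmeans objects
library(ggrepel)   # geom_text_repel()
p1 <- autoplot(Clusters, data = data1, frame = TRUE, label = FALSE, x = 1, y = 2)
p2 <- p1 + geom_text_repel(aes(label = rownames(data1)))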

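Since I have more than 500 data points, the labels predictably overlap and are unreadable. Is there a way to show a label only when you click on or hover over a data point? Any other solution is also welcome. Thanks.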

Answer 1

You can do this with plotly. With add_trace(), you can show labels only on hover.

library(ggplot2)
library(plotly)
library(dplyr)

data <- mtcars[, c("mpg", "wt")] # just two features

k_data <- kmeans(data, 3) # find clusters

Now we add the cluster information to data:

data <- cbind(data, cluster=k_data$cluster)
#                mpg    wt cluster
# Mazda RX4     21.0 2.620       1
# Mazda RX4 Wag 21.0 2.875       1
# Datsun 710    22.8 2.320       1

Now we can plot everything:

plot_ly(data = data, x = ~mpg, y = ~wt, color = ~as.factor(cluster)) %>% 
  add_trace(
    type = 'scatter',
    mode = 'markers',
    text = rownames(data), # hovering over a point shows its row name
    hoverinfo = 'text',
    showlegend = FALSE
  )
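To connect this back to the question's setup (scaled data, PCA, then k-means), here is a minimal sketch of the same hover-only labelling on PCA scores. The built-in USArrests data stands in for data1, which is not available here, and the names scores, km, and fig are made up for this illustration; four clusters is an arbitrary choice.

```r
library(plotly)

# Stand-in for the question's data1: scale, run PCA, keep the component scores.
# USArrests is only a placeholder; its row names (states) play the role of data2$Name.
scores <- as.data.frame(prcomp(USArrests, scale. = TRUE)$x)

set.seed(1) # kmeans starts from random centers
km <- kmeans(scores, centers = 4, nstart = 25)

scores$cluster <- factor(km$cluster)
scores$name <- rownames(scores) # label to show on hover

# First two components; the name appears only while hovering over a point.
fig <- plot_ly(
  data = scores, x = ~PC1, y = ~PC2,
  color = ~cluster,
  type = "scatter", mode = "markers",
  text = ~name,
  hoverinfo = "text"
)
fig
```

Keeping the label in its own column and passing it as a formula (text = ~name) keeps each label attached to its row when plotly splits the points into one trace per cluster.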

More guides here and here.
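Since the question starts from a ggplot2-based plot, another option that may fit is to build an ordinary ggplot and convert it with ggplotly(). The sketch below reuses the mtcars example from this answer; df, p, and fig_gg are illustrative names, and the text aesthetic is only used by plotly for the tooltip, not by ggplot2 itself.

```r
library(ggplot2)
library(plotly)

df <- mtcars[, c("mpg", "wt")] # same two features as above

set.seed(1) # kmeans starts from random centers
df$cluster <- factor(kmeans(df, centers = 3, nstart = 25)$cluster)
df$name <- rownames(df)

# 'text' is not drawn by ggplot2; ggplotly() picks it up for the tooltip.
p <- ggplot(df, aes(x = mpg, y = wt, colour = cluster, text = name)) +
  geom_point()

# tooltip = "text" restricts the hover box to the row name.
fig_gg <- ggplotly(p, tooltip = "text")
fig_gg
```

This keeps the static plot p available for export, while fig_gg is the interactive version with labels shown on hover only.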
