How do I visualize k-means clustering in R?

Time: 2018-10-12 14:52:55

Tags: r bioinformatics k-means

How can I run k-means clustering on the log2-transformed dataset below and plot the result like the attached image?

My sample df looks like this:

set.seed(5)
cnt_log2 = data.frame(replicate(6, runif(1000,0,20)), 1:10)
names(cnt_log2) = c(paste0("Col",1:6),"geneID")

[Image: clustering of gene counts]

1 answer:

Answer 0 (score: 0)

I used the following:

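library(data.table)
library(ggplot2)

# Assumption: df is the six numeric columns of the question's cnt_log2 (geneID dropped)
df <- cnt_log2[, paste0("Col", 1:6)]

res_km <- kmeans(df, 5, nstart = 10)

# long format: one row per (gene, column) pair, labelled with its cluster
data_plot <- melt(data.table(class = as.factor(res_km$cluster), df),
                  id.vars = "class")
data_plot[, Time := rep(1:ncol(df), each = nrow(df))]
data_plot[, ID := rep(1:nrow(df), ncol(df))]
head(data_plot)

# prepare centroids (reshape2::melt handles the centers matrix)
centers <- data.table(reshape2::melt(res_km$centers))
setnames(centers, c("Var1", "Var2"), c("class", "Time"))
centers[, class := as.factor(class)]  # same type as data_plot$class for faceting
centers[, ID := class]
centers[, gr := as.numeric(as.factor(Time))]
head(centers)

# plot the results: numeric Time on x so both layers share a continuous axis
ggplot(data_plot, aes(Time, value, group = ID)) +
  facet_wrap(~class, ncol = 2, scales = "free_y") +
  geom_line(color = "grey10", alpha = 0.65) +
  geom_line(data = centers, aes(gr, value),
            color = "firebrick1", alpha = 0.80, size = 1.2) +
  scale_x_continuous(breaks = 1:ncol(df), labels = names(df)) +
  labs(x = "Sample", y = "log2 count") +
  theme_bw()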
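
If you would rather avoid the extra packages, here is a rough base-R sketch of the same idea on the sample data from the question. The names mat and plot_clusters are my own, not from any package, and I am assuming only the six Col columns should be clustered (geneID is treated as an identifier). It is one possible way to do it, not the only one.

```r
# Sketch only: base-R version of the same idea, using the question's sample data.
set.seed(5)
cnt_log2 <- data.frame(replicate(6, runif(1000, 0, 20)), 1:10)
names(cnt_log2) <- c(paste0("Col", 1:6), "geneID")

# Assumption: cluster on the six numeric columns only, not on geneID.
mat <- as.matrix(cnt_log2[, paste0("Col", 1:6)])
res_km <- kmeans(mat, centers = 5, nstart = 10)

# res_km$cluster holds one cluster label per gene;
# res_km$centers holds one centroid row per cluster.
table(res_km$cluster)
dim(res_km$centers)

# Hypothetical helper: one panel per cluster,
# member profiles in grey and the centroid in red.
plot_clusters <- function(mat, km) {
  k <- nrow(km$centers)
  old_par <- par(mfrow = c(ceiling(k / 2), 2), mar = c(4, 4, 2, 1))
  on.exit(par(old_par))
  for (cl in seq_len(k)) {
    members <- mat[km$cluster == cl, , drop = FALSE]
    # matplot draws one line per column, so transpose: one line per gene
    matplot(t(members), type = "l", lty = 1, col = "grey40",
            xlab = "Sample", ylab = "log2 count",
            main = paste("Cluster", cl), xaxt = "n")
    axis(1, at = seq_len(ncol(mat)), labels = colnames(mat))
    lines(km$centers[cl, ], col = "firebrick1", lwd = 2)
  }
  invisible(NULL)
}
```

Calling plot_clusters(mat, res_km) draws the panels. Since the sample data is uniform random noise, the clusters will not show much structure; real expression data should look more like the image in the question.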