部分行标签热图-R

时间:2018-10-01 21:32:36

标签: r annotations label heatmap pheatmap

我想知道是否有人知道允许对热图进行部分行标记的软件包。我目前正在使用pheatmap()构造我的热图,但是我可以使用任何具有此功能的软件包。

我有很多行差异表达基因的图,我想标记它们的一个子集。 (我可以想到)有两件事要考虑:

  • 文本注释的位置取决于行的高度。如果行太窄,则文本标签将不明确,而没有某种指针。
  • 如果多个相邻行很重要(即将被标记),则这些行需要偏移,并且再次需要一个指针。

下面是部分解决方案的示例,它实际上只能到达一半,但我希望能说明我想做的事情。

set.seed(1)
require(pheatmap)
require(RColorBrewer)
require(grid)

### Data to plot
data_mat <- matrix(sample(1:10000, 300), nrow = 50, ncol = 6)
rownames(data_mat) <- paste0("Gene", 1:50)
colnames(data_mat) <- c(paste0("A", 1:3), paste0("B", 1:3))

### Set how many genes to annotate
  ### TRUE - make enough labels that some overlap
  ### FALSE - no overlap
tooMany <- T

### Select a few genes to annotate
if (tooMany) {
  sigGenes_v <- paste0("Gene", c(5,20,26,42,47,16,28))
  newMain_v <- "Too Many Labels"
} else {
  sigGenes_v <- paste0("Gene", c(5,20,26,42))
  newMain_v <- "OK Labels"
}

### Make color list
colors_v <- brewer.pal(8, "Dark2")
colors_v <- colors_v[c(1:length(sigGenes_v), 8)]
names(colors_v) <- c(sigGenes_v, "No")
annColors_lsv <- list("Sig" = colors_v)

### Column Metadata
colMeta_df <- data.frame(Treatment = c(rep("A", 3), rep("B", 3)),
                      Replicate = c(rep(1:3, 2)),
                      stringsAsFactors = F, 
                      row.names = colnames(data_mat))

### Row metadata
rowMeta_df <- data.frame(Sig = rep("No", 50), 
                      stringsAsFactors = F,
                      row.names = rownames(data_mat))
for (gene_v in sigGenes_v) rowMeta_df[rownames(rowMeta_df) == gene_v, "Sig"] <- gene_v

### Heatmap
heat <- pheatmap(data_mat,
                 annotation_row = rowMeta_df,
                 annotation_col = colMeta_df,
                 annotation_colors = annColors_lsv,
                 cellwidth = 10,
                 main = "Original Heat")

### Get order of genes after clustering
genesInHeatOrder_v <- heat$tree_row$labels[heat$tree_row$order]
whichSigInHeatOrder_v <- which(genesInHeatOrder_v %in% sigGenes_v)
whichSigInHeatOrderLabels_v <- genesInHeatOrder_v[whichSigInHeatOrder_v]

sigY <- 1 - (0.02 * whichSigInHeatOrder_v)

### Change title
whichMainGrob_v <- which(heat$gtable$layout$name == "main")
heat$gtable$grobs[[whichMainGrob_v]] <- textGrob(label = newMain_v, 
                                                 gp = gpar(fontsize = 16))

### Remove rows
whichRowGrob_v <- which(heat$gtable$layout$name == "row_names")
heat$gtable$grobs[[whichRowGrob_v]] <- textGrob(label = whichSigInHeatOrderLabels_v,
                                                y = sigY,
                                                vjust = 1)
grid.newpage()
grid.draw(heat)

以下是一些输出:

原始热图: original heatmap

确定标签: ok labels

好的标签,带有标志: ok labels, with flags

标签太多 too many labels

带有标签的标签过多 too many labels, with flags

“带有标志”输出是所需的最终结果。
我只是将这些保存为Rstudio图查看器中的图像。我知道我可以将它们另存为pdf,并提供更大的文件大小来消除标签重叠,但是单个单元格会比我想要的大。

1 个答案:

答案 0 :(得分:0)

根据您的代码,您似乎对gtables和grobs很满意。一种(相对)直接实现所需外观的方法是放大行标签grob,并在其中进行一些更改:

  1. 用“”替换不需要的标签;
  2. 在可用空间内均匀分布标签;
  3. 添加连接新旧标签位置的线段。

我为此编写了一个包装函数,其工作方式如下:

# heat refers to the original heatmap produced from the pheatmap() function
# kept.labels should be a vector of labels you wish to show
# repel.degree is a number in the range [0, 1], controlling how much the
# labels are spread out from one another

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 0)

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 0.5)

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 1)

plot

功能(注释中的说明):

add.flag <- function(pheatmap,
                     kept.labels,
                     repel.degree) {

  # repel.degree = number within [0, 1], which controls how much 
  #                space to allocate for repelling labels.
  ## repel.degree = 0: spread out labels over existing range of kept labels
  ## repel.degree = 1: spread out labels over the full y-axis

  heatmap <- pheatmap$gtable

  new.label <- heatmap$grobs[[which(heatmap$layout$name == "row_names")]] 

  # keep only labels in kept.labels, replace the rest with ""
  new.label$label <- ifelse(new.label$label %in% kept.labels, 
                            new.label$label, "")

  # calculate evenly spaced out y-axis positions
  repelled.y <- function(d, d.select, k = repel.degree){
    # d = vector of distances for labels
    # d.select = vector of T/F for which labels are significant

    # recursive function to get current label positions
    # (note the unit is "npc" for all components of each distance)
    strip.npc <- function(dd){
      if(!"unit.arithmetic" %in% class(dd)) {
        return(as.numeric(dd))
      }

      d1 <- strip.npc(dd$arg1)
      d2 <- strip.npc(dd$arg2)
      fn <- dd$fname
      return(lazyeval::lazy_eval(paste(d1, fn, d2)))
    }

    full.range <- sapply(seq_along(d), function(i) strip.npc(d[i]))
    selected.range <- sapply(seq_along(d[d.select]), function(i) strip.npc(d[d.select][i]))

    return(unit(seq(from = max(selected.range) + k*(max(full.range) - max(selected.range)),
                    to = min(selected.range) - k*(min(selected.range) - min(full.range)), 
                    length.out = sum(d.select)), 
                "npc"))
  }
  new.y.positions <- repelled.y(new.label$y,
                                d.select = new.label$label != "")
  new.flag <- segmentsGrob(x0 = new.label$x,
                           x1 = new.label$x + unit(0.15, "npc"),
                           y0 = new.label$y[new.label$label != ""],
                           y1 = new.y.positions)

  # shift position for selected labels
  new.label$x <- new.label$x + unit(0.2, "npc")
  new.label$y[new.label$label != ""] <- new.y.positions

  # add flag to heatmap
  heatmap <- gtable::gtable_add_grob(x = heatmap,
                                   grobs = new.flag,
                                   t = 4, 
                                   l = 4
  )

  # replace label positions in heatmap
  heatmap$grobs[[which(heatmap$layout$name == "row_names")]] <- new.label

  # plot result
  grid.newpage()
  grid.draw(heatmap)

  # return a copy of the heatmap invisibly
  invisible(heatmap)
}