Adjusting heatmap colors

Time: 2018-02-20 20:43:37

Tags: r csv bioinformatics heatmap

I have a CSV file with 8 rows and 5 columns, as shown below:

,MUT,AB1,M86,MU0,MZ4
2pc0,9.3235,9.2234,8.5654,6.5688,6.0312
2hb4,7.4259,7.9193,7.0837,6.1959,9.6501
3ixo,9.1124,4.8244,9.2058,5.6194,4.8181
2i0d,10.1331,9.9726,1.7889,2.1879,1.0692
2q5k,10.7538,0.377,9.8693,1.5496,9.869
4djq,12.0394,2.4673,3.7014,10.8828,1.4023
2q55,10.7834,1.4322,5.3941,0.871,1.7253
2qi1,10.0908,10.7989,4.1154,2.3832,1.2894

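I would like the following R script to plot a heatmap of my dataset in which values in [0; 2] are green, values in [2; 3] are yellow, and values in [3; maxvalue] are red, with the colors changing continuously.

Here is the code I am currently trying to use: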

#########################################################
### A) Installing and loading required packages
#########################################################

if (!require("gplots")) {
   install.packages("gplots", dependencies = TRUE)
   library(gplots)
   }
if (!require("RColorBrewer")) {
   install.packages("RColorBrewer", dependencies = TRUE)
   library(RColorBrewer)
   }


#########################################################
### B) Reading in data and transform it into matrix format
#########################################################

data <- read.csv('/mypath/raw/raw.csv', comment.char="#")
rnames <- data[,1]                            # assign labels in column 1 to "rnames"
mat_data <- data.matrix(data[,2:ncol(data)])  # transform column 2-5 into a matrix
rownames(mat_data) <- rnames                  # assign row names


#########################################################
### C) Customizing and plotting the heat map
#########################################################

# creates a custom color palette from green to red
my_palette <- colorRampPalette(c("green", "yellow", "red"))(n = 299)

# (optional) defines the color breaks manually for a "skewed" color transition
col_breaks = c(seq(0,2,length=200),  # for green
  seq(2,3,length=100),           # for yellow
  seq(3,15,length=1500))             # for red

# creates a 5 x 5 inch image
png("/mypath/raw/raw.png",    # create PNG for the heat map        
  width = 5*300,        # 5 x 300 pixels
  height = 5*300,
  res = 300,            # 300 pixels per inch
  pointsize = 8)        # smaller font size

heatmap.2(mat_data,
  cellnote = mat_data,  # same data set for cell labels
  main = "Correlation", # heat map title
  notecol="black",      # change font color of cell labels to black
  density.info="none",  # turns off density plot inside color legend
  trace="none",         # turns off trace lines inside the heat map
  margins =c(12,9),     # widens margins around plot
  col=my_palette,       # use the color palette defined earlier
  breaks=col_breaks,    # enable color transition at specified limits
  dendrogram="none",    # do not draw any dendrogram
  Colv="NA" )           # turn off column clustering

dev.off()               # close the PNG device

However, when I run this script, which I found on the internet, I get the following error:

Error in image.default(1:nc, 1:nr, x, xlim = 0.5 + c(0, nc), ylim = 0.5 +  : 
  must have one more break than colour

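I would be very grateful if you could show me how to replace 15 with the actual maximum value and adjust the color ranges and the skew accordingly.

Further questions:

I would also like to reorganize the output. I would like to put the column and row labels at the top and on the left. Also, is it possible to draw an outline around the cells at (x,y)=(4,1)(5,2)(6,3)(7,4)(8,5)?

1 Answer:

Answer 0: (score: 1)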

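I'm not sure exactly which colors you want. If you want a continuous color gradient, you need two colors for the values > 3 (the gradient should run between red and which other color?). Basically, one color is missing (I added "gold"). You should be able to adapt the example below easily.

Note that the number of breaks should not be too high (not in the thousands, as in your question), otherwise the key will turn out completely white.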
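To illustrate the point about breaks: the error message in the question says that there must be exactly one more break than there are colours. The sketch below uses only base R. It first counts the breaks and colours from the question's script, then builds a matching pair. The segment sizes (`n_low`, `n_mid`, `n_high`) and the hard-coded `max_value` are illustrative assumptions; with real data one would presumably use `max(mat_data)` instead.

```r
# Counts taken from the script in the question
n_colours_question <- 299
breaks_question <- c(seq(0, 2, length.out = 200),
                     seq(2, 3, length.out = 100),
                     seq(3, 15, length.out = 1500))
length(breaks_question)                             # 1800 breaks for 299 colours
length(breaks_question) == n_colours_question + 1   # FALSE, hence the error

# A matching set: k colours need k + 1 strictly increasing breaks
max_value <- 12.0394   # assumption: largest value in the table; use max(mat_data) in practice
n_low <- 20; n_mid <- 10; n_high <- 30   # hypothetical numbers of colours per segment

col_breaks <- c(seq(0, 2, length.out = n_low + 1),
                seq(2, 3, length.out = n_mid + 1)[-1],           # drop the repeated 2
                seq(3, max_value, length.out = n_high + 1)[-1])  # drop the repeated 3

my_palette <- c(colorRampPalette(c("forestgreen", "yellow"))(n_low),
                colorRampPalette(c("yellow", "gold"))(n_mid),
                colorRampPalette(c("gold", "red"))(n_high))

stopifnot(length(col_breaks) == length(my_palette) + 1)
```

Building the breaks segment by segment keeps the thresholds at exactly 2 and 3 and ends the scale at the data maximum, so no fixed upper limit such as 15 is needed.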

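Note also that a green-to-red gradient is actually not recommended, because a non-negligible proportion of the population cannot tell these colors apart (prefer blue-red or blue-green).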
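A blue-to-red scale with the same three-segment structure could look like the sketch below. The colour names and segment sizes are illustrative assumptions, not part of the original answer; the result would be passed as `col` together with a break vector that has one more element than the palette.

```r
n_low <- 20; n_mid <- 10; n_high <- 30   # hypothetical numbers of colours per segment

# Blue shades for [0, 2], fading to white over [2, 3], then towards red above 3
safe_palette <- c(colorRampPalette(c("blue", "lightblue"))(n_low),
                  colorRampPalette(c("lightblue", "white"))(n_mid),
                  colorRampPalette(c("white", "red"))(n_high))

length(safe_palette)   # must be one fewer than the number of breaks it is paired with
```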

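As far as I know, it is not possible to put the column and row labels in the top and left margins with heatmap.2. It is not possible to draw boxes either. However, you can draw horizontal and vertical lines.

You could have a look at the Bioconductor package ComplexHeatmap, which allows more control (including drawing boxes and changing the position of the labels).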
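A minimal sketch of the ComplexHeatmap route follows, assuming the ComplexHeatmap and circlize packages are installed (the plotting part is skipped otherwise). It reads the coordinates in the question as (row, column) pairs, which is an assumption based on the table having 8 rows and 5 columns. The helper `is_boxed` and the colour anchors are hypothetical choices made for illustration.

```r
mat_data <- as.matrix(read.csv(text = ',MUT,AB1,M86,MU0,MZ4
2pc0,9.3235,9.2234,8.5654,6.5688,6.0312
2hb4,7.4259,7.9193,7.0837,6.1959,9.6501
3ixo,9.1124,4.8244,9.2058,5.6194,4.8181
2i0d,10.1331,9.9726,1.7889,2.1879,1.0692
2q5k,10.7538,0.377,9.8693,1.5496,9.869
4djq,12.0394,2.4673,3.7014,10.8828,1.4023
2q55,10.7834,1.4322,5.3941,0.871,1.7253
2qi1,10.0908,10.7989,4.1154,2.3832,1.2894', row.names = 1))

# Cells to outline, read as (row, column) pairs (assumption)
boxed <- cbind(row = 4:8, col = 1:5)
is_boxed <- function(i, j) any(boxed[, "row"] == i & boxed[, "col"] == j)

if (requireNamespace("ComplexHeatmap", quietly = TRUE) &&
    requireNamespace("circlize", quietly = TRUE)) {
  # Continuous colour mapping anchored at 0, 2, 3 and the data maximum
  col_fun <- circlize::colorRamp2(c(0, 2, 3, max(mat_data)),
                                  c("forestgreen", "yellow", "gold", "red"))
  ht <- ComplexHeatmap::Heatmap(
    mat_data,
    name = "value",
    col = col_fun,
    cluster_rows = FALSE,          # keep the original row order
    cluster_columns = FALSE,       # keep the original column order
    row_names_side = "left",       # row labels on the left
    column_names_side = "top",     # column labels at the top
    cell_fun = function(j, i, x, y, width, height, fill) {
      # print the rounded value in every cell
      grid::grid.text(sprintf("%.2f", mat_data[i, j]), x, y,
                      gp = grid::gpar(fontsize = 8))
      # draw an outline around the selected cells only
      if (is_boxed(i, j)) {
        grid::grid.rect(x, y, width, height,
                        gp = grid::gpar(col = "black", fill = NA, lwd = 2))
      }
    }
  )
  ComplexHeatmap::draw(ht)
}
```

Because the colour mapping here is a function of the value rather than a palette paired with breaks, the "one more break than colour" constraint does not arise in this variant.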

library(gplots)
#> 
#> Attachement du package : 'gplots'
#> The following object is masked from 'package:stats':
#> 
#>     lowess
data <- read.csv(text = ',MUT,AB1,M86,MU0,MZ4
2pc0,9.3235,9.2234,8.5654,6.5688,6.0312
2hb4,7.4259,7.9193,7.0837,6.1959,9.6501
3ixo,9.1124,4.8244,9.2058,5.6194,4.8181
2i0d,10.1331,9.9726,1.7889,2.1879,1.0692
2q5k,10.7538,0.377,9.8693,1.5496,9.869
4djq,12.0394,2.4673,3.7014,10.8828,1.4023
2q55,10.7834,1.4322,5.3941,0.871,1.7253
2qi1,10.0908,10.7989,4.1154,2.3832,1.2894', comment.char="#")

rnames <- data[,1]                            # assign labels in column 1 to "rnames"
mat_data <- data.matrix(data[,2:ncol(data)])  # transform column 2-5 into a matrix
rownames(mat_data) <- rnames                  # assign row names

# First define your breaks
col_breaks <- seq(0,max(mat_data), by = 0.1)

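# Then define which color gradient you want between each pair of values
# Green-red gradient not recommended!
# NB: this will work only if the maximum value is > 3
# (20 colors cover [0, 2], 10 cover [2, 3], and the remaining
#  length(col_breaks) - 31 cover the values above 3)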
my_palette <- c(colorRampPalette(c("forestgreen", "yellow"))(20), 
                colorRampPalette(c("yellow", "gold"))(10),
                colorRampPalette(c("gold", "red"))(length(col_breaks)-31))

# x11(width = 10/2.54, height = 10/2.54)
mat_data <- round(mat_data,2) # probably better to round your values for easier reading

heatmap.2(mat_data,
          cellnote = mat_data,  # same data set for cell labels
          main = "Correlation", # heat map title
          notecol="black",      # change font color of cell labels to black
          density.info="none",  # turns off density plot inside color legend
          trace="none",         # turns off trace lines inside the heat map
          margins =c(4,4),      # widens margins around plot
          col=my_palette,       # use the color palette defined earlier
          breaks=col_breaks,    # enable color transition at specified limits
          dendrogram="none",    # do not draw any dendrogram
          Colv="NA",            # turn off column clustering

          # add horizontal and vertical lines (but no box...)
          colsep = 3,
          rowsep = 3,
          sepcolor = "black",

          # additional control of the presentation
          lhei = c(3,10),       # adapt the relative areas devoted to the matrix
          lwid = c(3,10), 
          cexRow = 1.2,
          cexCol = 1.2,
          key.title = "",
          key.par = list(mar = c(2,0.5,1.5,0.5), mgp = c(1, 0.5, 0))
          )           

Created on 2018-02-25 by the reprex package (v0.2.0).