Question

I have a large data.frame (ncols = 500, nrows = 14000). It looks like this:

          Sample1   Sample2   Sample3    .....    
Gene1       22         0        0.11     .....    
Gene2      0.112      0.1       0.4      .....     
Gene3      0.45        0        0.19     .....    
.....      .....     .....     .....     .....

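I want to plot this large amount of data without applying any statistics, so that differences in amount are obvious at a glance (simply by using colour or some other tool), for example between Gene1 and Gene2 in Sample1, and so on. Any ideas other than a heatmap?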

Answer 1

How about using geom_raster from ggplot2?

#  Make up some data
set.seed(1)
df <- data.frame( matrix( runif(25) , 5 , 5 ) )
#  Illustrative printout; the exact values depend on the seed and RNG settings
#         X1        X2         X3         X4         X5
#1 0.5316382 0.4360309 0.09576886 0.56497254 0.43930824
#2 0.2383700 0.1531009 0.71377161 0.39367645 0.42211072
#3 0.5009796 0.6549886 0.05996069 0.08236798 0.08574704
#4 0.1171437 0.8765644 0.29892712 0.06071803 0.78011966
#5 0.5066046 0.5486397 0.34770099 0.07785835 0.09659246

#  Absolute difference between adjacent columns, computed within each row
out <- data.frame( t( apply( df , 1 , function(x) abs( diff( x ) ) ) ) )

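#  Plot using geom_raster
library( ggplot2 )
library( reshape2 )
out.melt <- melt( out )
#  Row index for each melted value: melt() stacks the columns of out one after another
out.melt$y <- rep( seq_len( nrow( out ) ) , times = ncol( out ) )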
p <- ggplot( out.melt , aes( variable , y , fill = value ) ) + geom_raster()
p

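For a table shaped like the one in the question, the same idea can be sketched without building the row index by hand, by carrying the gene and sample names into the long format. This is only a sketch: the matrix `expr` and its values are made up, and the `log1p` rescaling is an optional assumption of mine (to stop a few large values such as 22 from washing out the small ones), not part of the answer above.

```r
# Hypothetical stand-in for the real genes x samples table
set.seed(42)
expr <- matrix(rexp(20 * 6), nrow = 20, ncol = 6,
               dimnames = list(paste0("Gene", 1:20), paste0("Sample", 1:6)))

# Long format in base R: one row per (gene, sample) pair
long <- as.data.frame(as.table(expr), responseName = "value")
names(long)[1:2] <- c("gene", "sample")

# Optional: compress the range of the fill values
long$log_value <- log1p(long$value)

# Draw the raster only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(sample, gene, fill = log_value)) + geom_raster()
  print(p)
}
```

With 14000 genes the axis labels would be unreadable, so in practice they would probably need to be hidden or the genes plotted in subsets.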

Answer 2

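If the question is about truly huge data (for example, when the number of data points is far larger than the number of pixels on the screen), how about Bin-summarise-smooth: a framework for visualising large data, as described here: http://vita.had.co.nz/papers/bigvis.html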

@Article{bigvis,
  title = {Bin-summarise-smooth: a framework for visualising large data},
  author = {Hadley Wickham},
  year = {Submitted},
  journal = {Infovis 2013},
}

See also this presentation: http://files.meetup.com/2906882/visualising_big_data_in_R.pdf

(for example, slide 5)
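To give a rough feel for the idea, here is a minimal base R sketch of the bin and summarise steps only (no smoothing). It does not use the API from the paper's package; the data, the grid size `n_bins` and all variable names are assumptions made up for illustration.

```r
# Hypothetical data: far more points than a plot has pixels
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- x + rnorm(n)

# Bin: assign every point to a cell of a fixed n_bins x n_bins grid
n_bins <- 50
x_breaks <- seq(min(x), max(x), length.out = n_bins + 1)
y_breaks <- seq(min(y), max(y), length.out = n_bins + 1)
x_bin <- findInterval(x, x_breaks, all.inside = TRUE)  # indices in 1..n_bins
y_bin <- findInterval(y, y_breaks, all.inside = TRUE)

# Summarise: count the points that fall in each cell
cell <- (y_bin - 1) * n_bins + x_bin
counts <- matrix(tabulate(cell, nbins = n_bins^2), n_bins, n_bins)

# Plot the small summary grid instead of the raw points
image(x_breaks, y_breaks, log1p(counts), xlab = "x", ylab = "y")
```

The plot then costs the same to draw however many raw points there are, because only the `n_bins` by `n_bins` count matrix reaches the graphics device.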

Graphical representation of large amounts of data
