Ellipses around groups on a PCA plot from DESeq2

Asked: 2017-11-23 19:38:44

Tags: r ggplot2 pca ellipse

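I would like to add ellipses around the three groups (based on the variable "outcome") in the plot below. Note that vsd is a DESeq2 object containing the factors outcome and batch: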

pcaData <- plotPCA(vsd, intgroup=c("outcome", "batch"), returnData=TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
ggplot(pcaData, aes(PC1, PC2, color=outcome, shape=batch)) +
  geom_point(size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) + 
  geom_text(aes(label=rownames(coldata_WM_D56C)),hjust=.5, vjust=-.8, size=3) +
  geom_density2d(alpha=.5) +
  coord_fixed()


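I tried adding an ellipse, expecting it to inherit the aesthetics from the top-level ggplot() call, but it tries to draw an ellipse for every individual point.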

stat_ellipse() +
  

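Too few points to calculate an ellipse

geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

Computation failed in stat_density2d(): missing value where TRUE/FALSE needed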

Any ideas? Thanks in advance.

> dput(pcaData)
structure(list(PC1 = c(-15.646673151638, -4.21111051849254, 13.1215703467274, 
-6.5477433859415, -3.22129766721873, 4.59321517871152, 1.84089686598042, 
37.8415172383233, 40.9996810499267, 37.6089348653721, -24.5520575763498, 
-46.5840253031228, -4.01498554781508, -31.227922394463), PC2 = c(31.2712754127142, 
5.89621557021357, -10.2425538634254, -3.44497747426626, 2.21504480008043, 
0.315695833259479, -4.66467589267529, -4.27504355920903, -1.08666029542243, 
-2.69753368235982, 5.89767436709778, -24.2836532766506, 4.43980653642228, 
0.659385524221137), group = structure(c(4L, 5L, 6L, 7L, 8L, 5L, 
8L, 1L, 2L, 3L, 6L, 9L, 9L, 9L), .Label = c("ctrl : 1", "ctrl : 2", 
"ctrl : 3", "non : 1", "non : 2", "non : 3", "preg : 1", "preg : 2", 
"preg : 3"), class = "factor"), outcome = structure(c(2L, 2L, 
2L, 1L, 1L, 2L, 1L, 3L, 3L, 3L, 2L, 1L, 1L, 1L), .Label = c("preg", 
"non", "ctrl"), class = "factor"), batch = structure(c(1L, 2L, 
3L, 1L, 2L, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("1", 
"2", "3"), class = "factor"), name = structure(1:14, .Label = c("D5-R-N-1", 
"D5-R-N-2", "D5-R-N-3", "D5-R-P-1", "D5-R-P-2", "D5-Z-N-1", "D5-Z-P-1", 
"D6-C-T-1", "D6-C-T-2", "D6-C-T-3", "D6-Z-N-1", "D6-Z-P-1", "D6-Z-P-2", 
"D6-Z-P-3"), class = "factor")), .Names = c("PC1", "PC2", "group", 
"outcome", "batch", "name"), row.names = c("D5-R-N-1", "D5-R-N-2", 
"D5-R-N-3", "D5-R-P-1", "D5-R-P-2", "D5-Z-N-1", "D5-Z-P-1", "D6-C-T-1", 
"D6-C-T-2", "D6-C-T-3", "D6-Z-N-1", "D6-Z-P-1", "D6-Z-P-2", "D6-Z-P-3"
), class = "data.frame", percentVar = c(0.47709343625754, 0.0990361123451665
))

Following Maurits Evers's suggestion, I added a group aesthetic, but it only draws ellipses for 2 of the 3 outcome types.

1 answer:

Answer 0: (score: 2)

Since you did not provide any sample data, here is an example using the faithful dataset.

The key is to add a group aesthetic.

library(ggplot2)

# Generate sample data
df <- faithful[1:10, ]
df$batch <- as.factor(rep(1:5, each = 2))

# This will throw a similar error/warning to yours
# ggplot(df, aes(waiting, eruptions, color = eruptions > 3, shape = batch)) + geom_point() + stat_ellipse()

# Add a group aesthetic and it works
ggplot(df, aes(waiting, eruptions, color = eruptions > 3, shape = batch, group = eruptions > 3)) +
  geom_point() +
  stat_ellipse()


So in your case, try adding aes(..., group = outcome).
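Here is a rough sketch of what that could look like with the data from the question. The data frame `pca` is a hypothetical stand-in for `pcaData`, with coordinates rounded from the `dput()` output. Moving `shape = batch` into `geom_point()` is my own choice so that the ellipse layer is grouped by outcome only. The sketch also counts the points per outcome: as far as I can tell, `stat_ellipse()` prints "Too few points to calculate an ellipse" and draws nothing for a group with fewer than four points, which would explain a missing ellipse for ctrl. Treat that threshold as an assumption and check it against your ggplot2 version.

```r
library(ggplot2)

# Hypothetical stand-in for pcaData; values rounded from the dput() in the question
pca <- data.frame(
  PC1 = c(-15.6, -4.2, 13.1, -6.5, -3.2, 4.6, 1.8,
          37.8, 41.0, 37.6, -24.6, -46.6, -4.0, -31.2),
  PC2 = c(31.3, 5.9, -10.2, -3.4, 2.2, 0.3, -4.7,
          -4.3, -1.1, -2.7, 5.9, -24.3, 4.4, 0.7),
  outcome = factor(
    c("non", "non", "non", "preg", "preg", "non", "preg",
      "ctrl", "ctrl", "ctrl", "non", "preg", "preg", "preg"),
    levels = c("preg", "non", "ctrl")
  ),
  batch = factor(c(1, 2, 3, 1, 2, 2, 2, 1, 2, 3, 3, 3, 3, 3))
)

# Count the points available for each ellipse
points_per_group <- table(pca$outcome)
print(points_per_group)

# Groups that are likely too small for stat_ellipse() (assumed threshold: 4 points)
too_few <- points_per_group[points_per_group < 4]
print(names(too_few))

# shape is mapped only in geom_point(), so the ellipses are grouped by outcome alone
p <- ggplot(pca, aes(PC1, PC2, color = outcome)) +
  geom_point(aes(shape = batch), size = 3) +
  stat_ellipse(aes(group = outcome)) +
  coord_fixed()

print(p)
```

With this data, preg has 6 points, non has 5 and ctrl has only 3, so ctrl is the group I would expect to be left without an ellipse. If you still want some kind of outline for such a small group, you would probably need more samples or a different approach, such as a convex hull.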