How to plot comparison data for three groups in R

Date: 2016-08-25 05:40:27

Tags: r ggplot2

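I have a data frame named mydf with a column GENE_SYMBOL and three further columns, one per cancer type: AML, CLL, and MDS. I want to plot the percentage for each gene in each of these cancers. What is a good way to show this in a plot?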

mydf <- structure(list(GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", 
"IDH2"), AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"
), CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"), MDS = c("7.00%", 
"28.00%", "7.00%", "10.00%", "3.00%")), .Names = c("GENE_SYMBOL", 
"AML", "CLL", "MDS"), row.names = c(NA, 5L), class = "data.frame")

1 Answer:

Answer 0: (score: 1)

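We can try barplot from base R. The percentage columns are stored as character strings, so first loop over those columns, remove the % with sub, and convert the result to numeric.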

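# Strip the "%" sign from every column except GENE_SYMBOL and convert to numeric
mydf[-1] <- lapply(mydf[-1], function(x) as.numeric(sub("[%]", "", x)))
# Rows are genes, so each cancer gets a group of five bars: one color per gene
barplot(`row.names<-`(as.matrix(mydf[-1]), mydf$GENE_SYMBOL), beside = TRUE,
        legend = TRUE, col = c("red", "green", "blue", "yellow", "purple"))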
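The dense one-liner above can be unpacked into separate steps, which may be easier to read and debug. The following is only a sketch: pct_to_num is a hypothetical helper name introduced here, the data frame is rebuilt from the question so the block runs on its own, and rainbow() is just one way to get one color per gene.

```r
# Data from the question, rebuilt so this block is self-contained
mydf <- data.frame(
  GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", "IDH2"),
  AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"),
  CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"),
  MDS = c("7.00%", "28.00%", "7.00%", "10.00%", "3.00%"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the original answer): "28.00%" -> 28
pct_to_num <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

# sapply() returns a numeric matrix with one column per cancer type
m <- sapply(mydf[-1], pct_to_num)
rownames(m) <- mydf$GENE_SYMBOL

# One group of bars per cancer, one bar (and legend entry) per gene
barplot(m, beside = TRUE, legend.text = TRUE,
        col = rainbow(nrow(m)), ylab = "Percent")
```

Because the matrix keeps the gene names as row names, barplot can build the legend without any extra arguments.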

If we want 'GENE_SYMBOL' on the x-axis:

barplot(t(`row.names<-`(mydf[-1], mydf$GENE_SYMBOL)), beside=TRUE, 
              legend = TRUE, col = c("red", "green", "blue"))

If we use ggplot:

library(dplyr)
library(tidyr)
library(ggplot2)
gather(mydf, Var, Val, -GENE_SYMBOL) %>% 
     mutate(Val = as.numeric(sub("[%]", "", Val))) %>% 
     ggplot(., aes(x= GENE_SYMBOL, y = Val)) + 
                    geom_bar(aes(fill = Var), position = "dodge", stat="identity")
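For readers on a newer tidyverse, the same plot can be sketched with pivot_longer and geom_col. This assumes tidyr 1.0.0 or later and a ggplot2 version that provides geom_col. The column names Cancer and Percent are illustrative choices, not part of the original answer, and the data frame is rebuilt from the question so the block runs on its own.

```r
library(tidyr)
library(ggplot2)

# Data from the question, rebuilt so this block is self-contained
mydf <- data.frame(
  GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", "IDH2"),
  AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"),
  CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"),
  MDS = c("7.00%", "28.00%", "7.00%", "10.00%", "3.00%"),
  stringsAsFactors = FALSE
)

# Long format: one row per gene/cancer pair.
# "Cancer" and "Percent" are illustrative names chosen for this sketch.
long <- pivot_longer(mydf, cols = -GENE_SYMBOL,
                     names_to = "Cancer", values_to = "Percent")
long$Percent <- as.numeric(sub("%", "", long$Percent, fixed = TRUE))

# Genes on the x-axis, one dodged bar per cancer type
p <- ggplot(long, aes(x = GENE_SYMBOL, y = Percent, fill = Cancer)) +
  geom_col(position = "dodge") +
  labs(x = "Gene", y = "Percent")
print(p)
```

geom_col plots the values as given, so it behaves like geom_bar with stat = "identity".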
