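I have a data frame named mydf with a GENE_SYMBOL column and three columns for different cancers: AML, CLL, and MDS. I want to plot the percentage of each gene in each of these cancers. What would be a good way to show this in a plot?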
mydf <- structure(list(GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1",
"IDH2"), AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"
), CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"), MDS = c("7.00%",
"28.00%", "7.00%", "10.00%", "3.00%")), .Names = c("GENE_SYMBOL",
"AML", "CLL", "MDS"), row.names = c(NA, 5L), class = "data.frame")
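Answer 0 (score: 1)
We can use barplot from base R. The percentage columns are stored as character strings, so first loop over those columns, remove the % sign with sub, and convert the values to numeric.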
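# Strip the "%" sign from every column except the first and convert to numeric
mydf[-1] <- lapply(mydf[-1], function(x) as.numeric(sub("[%]", "", x)) )
# One group per cancer, one bar per gene (five genes, so five colours)
barplot(`row.names<-`(as.matrix(mydf[-1]), mydf$GENE_SYMBOL), beside=TRUE,
legend = TRUE, col = c("red", "green", "blue", "yellow", "purple"))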
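To check the conversion step on its own before plotting, here is a minimal sketch. It assumes the same layout as mydf (one label column followed by percentage strings). The names pct_to_num and df are hypothetical and only used for illustration.

```r
# Hypothetical helper (not part of the original answer):
# strip a literal "%" and convert the remainder to numeric
pct_to_num <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

# Small illustrative data frame using the first two rows of mydf
df <- data.frame(GENE_SYMBOL = c("NPM1", "DNMT3A"),
                 AML = c("28.00%", "24.00%"),
                 CLL = c("0.00%", "8.00%"),
                 stringsAsFactors = FALSE)

# Convert every column except the first one
df[-1] <- lapply(df[-1], pct_to_num)

# Inspect the column types after conversion
str(df)
```

Using fixed = TRUE treats "%" as a literal character, which should behave the same as the "[%]" pattern used above.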
If we want 'GENE_SYMBOL' on the x axis:
barplot(t(`row.names<-`(mydf[-1], mydf$GENE_SYMBOL)), beside=TRUE,
legend = TRUE, col = c("red", "green", "blue"))
If we use ggplot:
library(dplyr)
library(tidyr)
library(ggplot2)
gather(mydf, Var, Val, -GENE_SYMBOL) %>%
mutate(Val = as.numeric(sub("[%]", "", Val))) %>%
ggplot(., aes(x= GENE_SYMBOL, y = Val)) +
geom_bar(aes(fill = Var), position = "dodge", stat="identity")
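In recent versions of tidyr, gather is marked as superseded in favour of pivot_longer. A possible equivalent is sketched below; it assumes tidyr 1.0.0 or later and a ggplot2 version that provides geom_col. The names mydf2, long, Cancer, and Percent are chosen here for illustration and do not come from the original answer.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same data as in the question, rebuilt so the sketch is self-contained
mydf2 <- data.frame(
  GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", "IDH2"),
  AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"),
  CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"),
  MDS = c("7.00%", "28.00%", "7.00%", "10.00%", "3.00%"),
  stringsAsFactors = FALSE
)

# Reshape to long format, then strip "%" and convert to numeric
long <- mydf2 %>%
  pivot_longer(cols = -GENE_SYMBOL, names_to = "Cancer", values_to = "Percent") %>%
  mutate(Percent = as.numeric(sub("%", "", Percent, fixed = TRUE)))

# Dodged bars: genes on the x axis, one bar per cancer type
p <- ggplot(long, aes(x = GENE_SYMBOL, y = Percent, fill = Cancer)) +
  geom_col(position = "dodge") +
  labs(x = "Gene", y = "Percentage")
print(p)
```

geom_col is shorthand for geom_bar(stat = "identity"), so the resulting plot should match the gather-based version apart from the axis and legend labels.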