Barplot of DNA base frequencies for different sequence lengths

Time: 2017-09-06 20:52:32

Tags: r ggplot2

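I read this data frame in as df with df <- read.table("WT1.txt", header = TRUE). I want to plot a histogram that shows the A, C, G and T frequencies for each length value. Is there a better way to plot this?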

df

 length      A      C      G       T
     17  95668  73186 162726  730847
     18 187013  88641 120631  334695
     19 146061 373719 152215  303973
     20 249897  73862 115441  343179
     21 219899  82356 109536  636704
     22 226368 101499 111974 1591106
     23 188187 112155  98002 1437280

2 answers:

Answer 0: (score: 3)

You can melt the data frame into long format, keeping length as the id variable, and then draw a stacked bar chart with ggplot2:

df <- read.table(text=
    "length      A      C      G       T
     17  95668  73186 162726  730847
     18 187013  88641 120631  334695
     19 146061 373719 152215  303973
     20 249897  73862 115441  343179
     21 219899  82356 109536  636704
     22 226368 101499 111974 1591106
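     23 188187 112155  98002 1437280", header = TRUE)

# Reshape to long format: one row per (length, base) pair.
library(reshape2)
df_long <- melt(df, id.vars = "length")

# Stacked bars: one bar per length, segments filled by base.
library(ggplot2)
ggplot(df_long) +
  geom_bar(aes(x = length, y = value, fill = variable), stat = "identity")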
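For comparison, here is a minimal base R sketch of the same stacked chart that needs no extra packages. It is not part of the answer above; the names wide and counts are made up for illustration, and the data is copied from the question. barplot() stacks the rows of a matrix within each column, so the counts are transposed to put one column per length.

```r
# Data from the question, in wide format (one column per base).
wide <- read.table(text = "
length      A      C      G       T
17  95668  73186 162726  730847
18 187013  88641 120631  334695
19 146061 373719 152215  303973
20 249897  73862 115441  343179
21 219899  82356 109536  636704
22 226368 101499 111974 1591106
23 188187 112155  98002 1437280", header = TRUE)

# Rows = bases, columns = lengths, so each bar is one length.
counts <- t(as.matrix(wide[, c("A", "C", "G", "T")]))
colnames(counts) <- wide$length

# Stacked bars; legend.text = TRUE takes the legend from the row names.
barplot(counts, legend.text = TRUE, xlab = "length", ylab = "count")
```

Each bar's total height is the total count for that length, which is the same picture the stacked geom_bar call produces.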

Answer 1: (score: 1)

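Use dplyr to compute the frequency of each base within every length, and ggplot2 to draw the bar chart. I prefer stat = "identity", position = "dodge" over stat = "identity" alone, because side-by-side bars give a better sense of what the data looks like.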

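library(tidyverse)

# df is the original wide data frame (columns length, A, C, G, T).
gather(df, Base, value, -length) %>%            # wide -> long: one row per (length, Base)
    group_by(length) %>%
    mutate(frequency = value / sum(value)) %>%  # share of each base within a length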
    ggplot(aes(factor(length), y = frequency, fill = Base))+
        geom_bar(stat = "identity", position = "dodge",
                 color = "black", width = 0.6) +
        labs(x = "Base pairs", 
             y = "Frequency",
             fill = "Base") +
        scale_y_continuous(limits = c(0, 1)) +
        scale_fill_brewer(palette = "Set1") +
        theme_classic()
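The frequency step (value / sum(value) within each length) can be cross-checked in base R. The sketch below is an illustration rather than part of the answer; wide, counts and freq are hypothetical names, and the data is copied from the question. prop.table() with margin = 1 divides every row by its row total, and beside = TRUE gives the dodged layout.

```r
# Data from the question, in wide format (one column per base).
wide <- read.table(text = "
length      A      C      G       T
17  95668  73186 162726  730847
18 187013  88641 120631  334695
19 146061 373719 152215  303973
20 249897  73862 115441  343179
21 219899  82356 109536  636704
22 226368 101499 111974 1591106
23 188187 112155  98002 1437280", header = TRUE)

# Rows = lengths, columns = bases.
counts <- as.matrix(wide[, c("A", "C", "G", "T")])
rownames(counts) <- wide$length

# Per-length proportions: each row is divided by its own total.
freq <- prop.table(counts, margin = 1)

# Sanity check: the four base frequencies sum to 1 for every length.
stopifnot(all(abs(rowSums(freq) - 1) < 1e-12))

# Side-by-side bars, one group per length (transpose so bases are rows).
barplot(t(freq), beside = TRUE, legend.text = TRUE,
        ylim = c(0, 1), xlab = "length", ylab = "Frequency")
```

Normalising within each length makes the lengths comparable even though their total counts differ by a large factor.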
