ggplot2 - create stacked proportion histograms for individuals, separated by population

Time: 2014-04-09 00:00:16

Tags: r ggplot2 histogram

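Basically, I have a dataset with 4 columns containing the following information: the individual ("Ind"), the geographic population the individual belongs to ("Pop"), the proportion of its genome assigned to cluster 1, and the proportion of its genome assigned to cluster 2 (the last two sum to 1).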

Example:

    Ind <- c(1:20)
    Pop <- rep(1:2, each = 10)
    set.seed(234)
    Cluster1 <- runif(20, 0.0, 1.0)
    Cluster2 <- 1-Cluster1
    df <- data.frame(Ind, Pop, Cluster1, Cluster2)

Data:

       Ind Pop    Cluster1   Cluster2
    1    1   1 0.745619998 0.25438000
    2    2   1 0.781712425 0.21828758
    3    3   1 0.020037114 0.97996289
    4    4   1 0.776085387 0.22391461
    5    5   1 0.066910093 0.93308991
    6    6   1 0.644795124 0.35520488
    7    7   1 0.929385959 0.07061404
    8    8   1 0.717642189 0.28235781
    9    9   1 0.927736510 0.07226349
    10  10   1 0.284230120 0.71576988
    11  11   2 0.555724930 0.44427507
    12  12   2 0.547701653 0.45229835
    13  13   2 0.582847855 0.41715215
    14  14   2 0.582989913 0.41701009
    15  15   2 0.001198341 0.99880166
    16  16   2 0.441117854 0.55888215
    17  17   2 0.313152501 0.68684750
    18  18   2 0.740014466 0.25998553
    19  19   2 0.138326844 0.86167316
    20  20   2 0.871777777 0.12822222

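I would like to use ggplot2 to make a plot like panel "A" in this figure. In that figure, each individual is a bar showing the proportion of each cluster, but the x-axis ticks are the populations, and vertical grid lines separate the populations. I know that if I ignore Pop and use melt(), I can easily produce a stacked histogram. But I would like to know how to incorporate Pop to make an elegant plot like the one linked above.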

Thanks!

1 Answer:

Answer 0 (score: 1)

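How about melting with Ind and Pop as id variables and plotting with facet_grid? It is not 100% like the plot you are looking for, but it gets very close with a few theme tweaks: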

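    library(ggplot2)
    library(reshape2)  # provides melt()
    library(grid)      # provides unit()

    # Long format: one row per individual per cluster; Ind and Pop stay as ids
    dfm <- melt(df, id = c("Ind", "Pop"))
    ggplot(dfm, aes(Ind, value, fill = variable)) +
        geom_bar(stat = "identity", width = 1) +
        facet_grid(~Pop, scales = "free_x") +  # one panel per population
        scale_y_continuous(name = "", expand = c(0, 0)) +
        scale_x_continuous(name = "", expand = c(0, 0), breaks = dfm$Ind) +
        theme(
            # linewidth needs ggplot2 >= 3.4.0; use size on older versions
            panel.border = element_rect(colour = "black", linewidth = 1, fill = NA),
            strip.background = element_rect(colour = "black", linewidth = 1),
            # panel.spacing replaces the deprecated panel.margin (ggplot2 >= 2.2.0)
            panel.spacing = unit(0, "cm"),
            axis.text.x = element_blank()
        )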

[Figure: ggplot example]
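To make the melt step less of a black box, here is a minimal base-R sketch of the long format it produces for the question's data. The names `dfm_manual` and `bar_heights` are made up for illustration and are not part of the answer's code; the sketch only assumes that melt with Ind and Pop as ids yields one row per individual per cluster, with `variable` and `value` columns.

```r
# Rebuild the example data from the question (same seed).
Ind <- 1:20
Pop <- rep(1:2, each = 10)
set.seed(234)
Cluster1 <- runif(20, 0.0, 1.0)
Cluster2 <- 1 - Cluster1
df <- data.frame(Ind, Pop, Cluster1, Cluster2)

# Hand-built long format (hypothetical name dfm_manual):
# one row per (individual, cluster) pair instead of one row per individual.
dfm_manual <- data.frame(
    Ind      = rep(df$Ind, times = 2),
    Pop      = rep(df$Pop, times = 2),
    variable = factor(rep(c("Cluster1", "Cluster2"), each = nrow(df))),
    value    = c(df$Cluster1, df$Cluster2)
)

# Sanity check: each individual's two proportions stack to a bar of height 1.
bar_heights <- tapply(dfm_manual$value, dfm_manual$Ind, sum)
print(head(dfm_manual))
```

Because Pop is carried along as an id column, it is still available for faceting after the reshape, which is what lets facet_grid split the bars by population.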

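Update: my example failed to cover the more complex case of several populations with unequal numbers of individuals. A quick modification using the space = "free_x" argument of facet_grid handles this case. Complete code, for example: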

    require(ggplot2)
    require(reshape2)  # provides melt()
    require(grid)      # provides unit()

    # Three populations of unequal size: 5, 15 and 10 individuals
    Ind <- c(1:30)
    Pop <- rep(paste("Pop", 1:3), times = c(5, 15, 10))
    set.seed(234)
    Cluster1 <- runif(30, 0.0, 1.0)
    Cluster2 <- 1 - Cluster1
    df <- data.frame(Ind, Pop, Cluster1, Cluster2)

    dfm <- melt(df, id = c("Ind", "Pop"))
    ggplot(dfm, aes(Ind, value, fill = variable)) +
        geom_bar(stat = "identity", width = 1) +
        # space = "free_x" makes each panel's width follow its x range
        facet_grid(~Pop, scales = "free_x", space = "free_x") +
        scale_y_continuous(name = "", expand = c(0, 0)) +
        scale_x_continuous(name = "", expand = c(0, 0), breaks = dfm$Ind) +
        theme(
            # linewidth needs ggplot2 >= 3.4.0; use size on older versions
            panel.border = element_rect(colour = "black", linewidth = 1, fill = NA),
            strip.background = element_rect(colour = "black", linewidth = 1),
            # panel.spacing replaces the deprecated panel.margin (ggplot2 >= 2.2.0)
            panel.spacing = unit(0, "cm"),
            axis.text.x = element_blank()
        )

[Figure: ggplot example2]
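As a rough illustration of why space = "free_x" matters here, the arithmetic below compares relative bar widths with and without it. It is only a sketch: the variable names are hypothetical, and it assumes equal-width panels when only scales = "free_x" is set and ignores any spacing between panels.

```r
# Group sizes from the updated example: 5, 15 and 10 individuals.
Pop <- rep(paste("Pop", 1:3), times = c(5, 15, 10))
n_per_pop <- table(Pop)
n <- as.vector(n_per_pop)

# scales = "free_x" only: the three panels share the plot width equally,
# so the bar width inside a panel is proportional to 1 / (group size).
bar_width_equal_panels <- (1 / length(n)) / n

# scales = "free_x" plus space = "free_x": panel width is proportional to
# the group size, so every bar ends up with the same width.
panel_width_free <- n / sum(n)
bar_width_free_space <- panel_width_free / n

print(n_per_pop)
print(bar_width_equal_panels / min(bar_width_equal_panels))
print(bar_width_free_space)
```

Under these assumptions, bars in the smallest population would be drawn three times wider than bars in the largest one without space = "free_x", which is the distortion the update removes.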