Manipulating boxplot aesthetics in R

Time: 2017-04-21 14:18:44

Tags: r ggplot2

I'm stuck on a few details of a boxplot. I'm working with the following data (well, this is a sample of it):

dput(birds[1:20,])
structure(list(status = c(1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L), length = c(1520L, 
1250L, 870L, 720L, 820L, 770L, 50L, 570L, 580L, 480L, 470L, 450L, 
435L, 275L, 256L, 230L, 330L, 330L, 300L, 180L), mass = c(9600, 
5000, 3360, 2517, 3170, 4390, 1930, 1020, 910, 590, 539, 940, 
684, 230, 162, 170, 501, 439, 386, 95), range = c(1.21, 0.56, 
0.07, 1.1, 3.45, 2.96, 0.01, 9.01, 7.9, 4.33, 1.04, 2.17, 4.81, 
0.31, 0.24, 0.77, 2.23, 0.22, 2.4, 0.69), migr = c(1L, 1L, 1L, 
3L, 3L, 2L, 1L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L), insect = c(12L, 0L, 0L, 12L, 0L, 0L, 0L, 6L, 6L, 0L, 12L, 
12L, 12L, 3L, 3L, 3L, 3L, 3L, 3L, 12L), diet = c(2L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 
2L)), .Names = c("status", "length", "mass", "range", "migr", 
"insect", "diet"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 22L
), class = "data.frame")

and created this plot:

give.n <- function(x){
  return(c(y = mean(x), label = length(x)))
}

plot2 <- ggplot(birds, aes(x = factor(diet, labels = c("herbivorous", "omnivorous","carnivorous")),
                           y = range, fill = factor(status,labels = c("absent", "present")))) + 
          geom_boxplot() + labs(x = "Diet", fill = "Status") +
          stat_summary(fun.data = give.n, geom = "text") +
          geom_jitter()
plot2


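This is where I'm stuck:

  • I would like the jittered points to use different colors depending on migr (note: migr is 1 = sedentary, 2 = sedentary and migratory, 3 = migratory), ideally with a legend as well. I tried adding + geom_jitter(birds, aes (x = factor(diet)), without success.
  • How can I move the numbers (observation counts) so that each one sits in the middle of its boxplot? I tried different variants of position, but had no luck there either.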

1 answer:

Answer 0 (score: 1)

I suggest setting the factor levels outside the ggplot call:

library(ggplot2)

df <- birds
df$diet <- factor(df$diet, levels = 1:3, labels = c("herbivorous", "omnivorous","carnivorous"))
df$status <- factor(df$status, levels = 0:1, labels = c("absent", "present"))
df$migr <- factor(df$migr, levels = 1:3, labels = c('sedentary', 'sedentary & migratory', 'migratory'))

give.n <- function(x){
    return(c(y = mean(x), label = length(x)))
}

ggplot(df, aes(x = diet, y = range, fill = status)) +
    geom_boxplot() + labs(x = "Diet", fill = "Status") +
    stat_summary(fun.data = give.n, geom = "text",
                 position = position_dodge(width = 0.75)) +
    geom_jitter(aes(color = migr)) +
    scale_color_brewer(palette = 'Set1')


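Since the plot shows both the counts and the jittered points, it would be nice if the points drawn over each box corresponded to that box's count. To get that, we have to tell geom_jitter to dodge by status: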

ggplot(df, aes(x = diet, y = range, fill = status)) +
    geom_boxplot() + labs(x = "Diet", fill = "Status") +
    stat_summary(fun.data = give.n, geom = "text",
                 position = position_dodge(width = 0.75)) +
    geom_jitter(aes(color = migr, group = status),
                position = position_jitterdodge(dodge.width = 0.75)) +
    scale_color_brewer(palette = 'Set1')


If you want to change the jitter width, change the jitter.width argument of position_jitterdodge.
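
As a rough illustration of that last point, here is a minimal sketch with a narrower jitter. The data frame `toy` and its values are made up purely so the snippet runs on its own (they are not the `birds` data), and the value 0.1 for jitter.width is an arbitrary choice to experiment with:

```r
library(ggplot2)

# Hypothetical toy data standing in for `birds`; all values are invented
set.seed(1)
toy <- data.frame(
  diet   = factor(rep(c("herbivorous", "omnivorous"), each = 10)),
  status = factor(rep(c("absent", "present"), times = 10)),
  migr   = factor(sample(c("sedentary", "sedentary & migratory", "migratory"),
                         20, replace = TRUE)),
  range  = runif(20, 0, 10)
)

p <- ggplot(toy, aes(x = diet, y = range, fill = status)) +
  geom_boxplot() +
  # jitter.width narrows the horizontal spread of the points;
  # dodge.width = 0.75 keeps them aligned with the dodged boxes
  geom_jitter(aes(color = migr, group = status),
              position = position_jitterdodge(jitter.width = 0.1,
                                              dodge.width = 0.75)) +
  scale_color_brewer(palette = "Set1")
p
```

Smaller values of jitter.width keep the points closer to the centre of each box, and larger values spread them out. Leaving dodge.width at 0.75 matches the default dodge of geom_boxplot, so the points stay over their own boxes.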