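ggplot2: boxplot with points and fill separation

Asked: 2017-01-20 13:26:30

Tags: r ggplot2 boxplot

I have data that can be split by two grouping variables: one is the year, the other is a field characteristic.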

box<-as.data.frame(1:36)

box$year <- c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
              1997,1997,1997,1997,1997,1997,1997,1997,1997,
              1996,1996,1996,1996,1996,1996,1996,1996,1996,
              1997,1997,1997,1997,1997,1997,1997,1997,1997)
box$year <- as.character(box$year)

box$case <- c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
              6.00,6.11,6.40,7.00,NA,5.44,6.00,  NA,6.00,
              6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
              6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82)

box$code <- c("L","L","L","L","L","L","L","L","L","L","L","L",
              "L","L","L","L","L","L","M","M","M","M","M","M",
              "M","M","M","M","M","M","M","M","M","M","M","M")

colour <- factor(box$code, labels = c("#F8766D", "#00BFC4"))

On the boxplots I want to overlay the individual points to see how the data are distributed. With a single boxplot per year this is easy:

ggplot(box, aes(x = year, y = case, fill = "#F8766D")) +
  geom_boxplot(alpha = 0.80) +
  geom_point(colour = colour, size = 5) +
  theme(text = element_text(size = 18),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "none")


But it gets more complicated once I add a fill aesthetic:

ggplot(box, aes(x = year, y = case, fill = code)) +
  geom_boxplot(alpha = 0.80) +
  geom_point(colour = colour, size = 5) +
  theme(text = element_text(size = 18),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "none")


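Now the question: how do I move the points onto the boxplot they belong to? The blue points should sit on the blue boxplot and the red points on the red boxplot.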

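2 Answers:

Answer 0: (score: 12)

As Henrik said, use position_jitterdodge() and shape = 21. Shape 21 is a filled circle, so the points pick up the same fill aesthetic that the dodging is based on. You can also clean up the code a little:

  1. There is no need to define box first and then fill it in piece by piece.
  2. If you like, you can let ggplot work out the colours and skip building the colour factor. If you want to change the defaults, look at scale_fill_manual and scale_color_manual.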

    box <- data.frame(year = c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
                      1997,1997,1997,1997,1997,1997,1997,1997,1997,
                      1996,1996,1996,1996,1996,1996,1996,1996,1996,
                      1997,1997,1997,1997,1997,1997,1997,1997,1997),
                      case  = c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
                      6.00,6.11,6.40,7.00,NA,5.44,6.00,  NA,6.00,
                      6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
                      6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82),
                      code = c("L","L","L","L","L","L","L","L","L","L","L","L",
                      "L","L","L","L","L","L","M","M","M","M","M","M",
                      "M","M","M","M","M","M","M","M","M","M","M","M"))
    
    ggplot(box, aes(x = factor(year), y = case, fill = code)) +
      geom_boxplot(alpha = 0.80) +
      geom_point(aes(fill = code), size = 5, shape = 21, position = position_jitterdodge()) +
      theme(text = element_text(size = 18),
            axis.title.x = element_blank(),
            axis.title.y = element_blank(),
            panel.grid.minor.x = element_blank(),
            panel.grid.major.x = element_blank(),
            legend.position = "none")
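
The two pointers above (tuning the dodge and overriding the default fills) could be sketched roughly as follows. This is only an illustration under stated assumptions: demo is a small made-up data frame with the same columns as box, pal is a hypothetical palette built from the two hex codes in the question, and the jitter.width and seed values are arbitrary choices. The seed argument assumes a reasonably recent ggplot2.

```r
library(ggplot2)

# Made-up data with the same columns as `box` (values are illustrative only)
demo <- data.frame(
  year = rep(c(1996, 1997), each = 6),
  case = c(6.4, 6.1, 5.5, 6.0, 6.6, 7.1,
           6.0, 6.4, 5.4, 6.7, 6.3, 5.9),
  code = rep(rep(c("L", "M"), each = 3), times = 2)
)

# Hypothetical palette: the two hex codes used in the question
pal <- c(L = "#F8766D", M = "#00BFC4")

p <- ggplot(demo, aes(x = factor(year), y = case, fill = code)) +
  # outlier.shape = NA hides the boxplot's own outlier marks, because
  # every observation is already drawn by geom_point()
  geom_boxplot(alpha = 0.80, outlier.shape = NA) +
  # dodge.width = 0.75 matches the default dodge of geom_boxplot, so the
  # points line up with their boxes; seed makes the jitter reproducible
  geom_point(size = 5, shape = 21,
             position = position_jitterdodge(jitter.width = 0.15,
                                             dodge.width = 0.75,
                                             seed = 1)) +
  # Override the default fill colours with the named palette
  scale_fill_manual(values = pal)

p
```

Because the points use shape = 21, scale_fill_manual controls both the boxes and the point interiors; scale_color_manual would only matter if the outline colour were mapped as well.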
    

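Answer 1: (score: 4)

I see you have already accepted @JakeKaupp's nice answer, but I thought I would offer a different option using geom_dotplot. The data set you are visualising is fairly small, so why not drop the boxplot altogether?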

ggplot(box, aes(x = factor(year), y = case, fill = code))+
    geom_dotplot(binaxis = 'y', stackdir = 'center',
                 position = position_dodge())
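
When binwidth is not given, geom_dotplot falls back to a default of 1/30 of the data range and prints a message saying so. A possible variation that sets it explicitly is sketched below; demo is a made-up data frame standing in for box, and the binwidth and dodge width values are assumptions to be tuned by eye, not recommendations.

```r
library(ggplot2)

# Made-up data with the same columns as `box` (values are illustrative only)
demo <- data.frame(
  year = rep(c(1996, 1997), each = 6),
  case = c(6.4, 6.1, 5.5, 6.0, 6.6, 7.1,
           6.0, 6.4, 5.4, 6.7, 6.3, 5.9),
  code = rep(rep(c("L", "M"), each = 3), times = 2)
)

p_dot <- ggplot(demo, aes(x = factor(year), y = case, fill = code)) +
  geom_dotplot(binaxis = "y", stackdir = "center",
               binwidth = 0.1,                          # assumed value; larger bins give larger dots
               position = position_dodge(width = 0.8))  # assumed spacing between the L and M stacks

p_dot
```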
