Keep only the n largest groups in geom_boxplot

Time: 2019-02-20 13:32:01

Tags: r ggplot2 dplyr forcats

Is there a clever way to keep only the n largest groups (by count) when creating a boxplot?


For example, I would like the plot to keep only the top 5 manufacturers from the count table below.

library(tidyverse)

head(mpg)

# A tibble: 6 x 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
  <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa~
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa~
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa~
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa~
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa~
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa~

mpg %>% 
  count(manufacturer, sort=TRUE)

# A tibble: 15 x 2
   manufacturer     n
   <chr>        <int>
 1 dodge           37
 2 toyota          34
 3 volkswagen      27
 4 ford            25
 5 chevrolet       19
 6 audi            18
 7 hyundai         14
 8 subaru          14
 9 nissan          13
10 honda            9
11 jeep             8
12 pontiac          5
13 land rover       4
14 mercury          4
15 lincoln          3


1 answer:

Answer 0: (score: 1)

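What you need to do is extract the N wanted manufacturers before the ggplot call and pass them to scale_y_discrete(limits = ...) (limits subsets the variable to the wanted values and plots only those).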

library(tidyverse)

nWanted <- 5
foo <- head(count(mpg, manufacturer, sort = TRUE), nWanted)$manufacturer
# [1] "dodge"      "toyota"     "volkswagen" "ford"       "chevrolet"     

ggplot(mpg) +
    geom_boxplot(aes(displ, manufacturer)) +
    scale_y_discrete(limits = foo)


A more correct solution is to pass the categorical variable to the x axis and then flip the coordinates:

ggplot(mpg) +
    geom_boxplot(aes(manufacturer, displ)) +
    coord_flip() +
    scale_x_discrete(limits = foo)
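
Since the question is also tagged dplyr, here is a minimal sketch of an alternative: filter the data down to the largest groups before plotting, instead of restricting the scale. This is not part of the original answer; the names n_wanted, top_manufacturers, and mpg_top are made up for illustration.

```r
library(dplyr)
library(ggplot2)

n_wanted <- 5  # hypothetical name: number of groups to keep

# Names of the n_wanted manufacturers with the most rows
top_manufacturers <- mpg %>%
  count(manufacturer, sort = TRUE) %>%
  head(n_wanted) %>%
  pull(manufacturer)

# Keep only the rows that belong to those manufacturers
mpg_top <- mpg %>%
  filter(manufacturer %in% top_manufacturers)

# Plot the filtered data; no scale limits are needed
p <- ggplot(mpg_top) +
  geom_boxplot(aes(manufacturer, displ)) +
  coord_flip()
p
```

With this approach the other manufacturers never reach ggplot, so the same filtered data frame can be reused for other plots or summaries. The groups appear in the default alphabetical order rather than in order of count.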