When creating a boxplot, is there a smart way to keep only the n largest groups (by count)?
This is the plot. For example, I would like to keep only the top 5 manufacturers from the count table below.
library(tidyverse)
head(mpg)
# A tibble: 6 x 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class
  <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa~
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa~
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa~
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa~
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa~
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa~
mpg %>%
  count(manufacturer, sort = TRUE)
# A tibble: 15 x 2
   manufacturer     n
   <chr>        <int>
 1 dodge           37
 2 toyota          34
 3 volkswagen      27
 4 ford            25
 5 chevrolet       19
 6 audi            18
 7 hyundai         14
 8 subaru          14
 9 nissan          13
10 honda            9
11 jeep             8
12 pontiac          5
13 land rover       4
14 mercury          4
15 lincoln          3
Answer 0 (score: 1)
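What you need to do is extract the N wanted manufacturers before the ggplot call and pass them to scale_y_discrete(limits = ...). The limits argument restricts the discrete axis to the listed values, so only those manufacturers are plotted.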
library(tidyverse)
nWanted <- 5
# Names of the nWanted manufacturers with the most rows
foo <- head(count(mpg, manufacturer, sort = TRUE), nWanted)$manufacturer
# [1] "dodge" "toyota" "volkswagen" "ford" "chevrolet"
ggplot(mpg) +
  geom_boxplot(aes(displ, manufacturer)) +
  scale_y_discrete(limits = foo)
A more conventional solution passes the categorical variable to the x axis and then flips the coordinates:
ggplot(mpg) +
  geom_boxplot(aes(manufacturer, displ)) +
  coord_flip() +
  scale_x_discrete(limits = foo)
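As an alternative to setting limits on the scale, the same selection could probably be made by filtering the data before it reaches ggplot. The following is only a minimal sketch, assuming dplyr's filter() and the %in% operator; the names top_mpg and p are illustrative and not part of the original answer.

```r
library(tidyverse)

nWanted <- 5
# Names of the nWanted manufacturers with the most rows in mpg
foo <- head(count(mpg, manufacturer, sort = TRUE), nWanted)$manufacturer

# Keep only the rows that belong to those manufacturers
# (top_mpg is a hypothetical name used for illustration)
top_mpg <- filter(mpg, manufacturer %in% foo)

# Categorical variable on x, then flip so the boxes are horizontal
p <- ggplot(top_mpg) +
  geom_boxplot(aes(manufacturer, displ)) +
  coord_flip()
p
```

Unlike limits = foo, which orders the axis as given in foo, this version keeps ggplot2's default alphabetical ordering of the manufacturers.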