Consider the following:
library(tidyverse)
library(broom)
tidy.quants <- mtcars %>%
  nest(-cyl) %>%
  mutate(quantiles = map(data, ~ quantile(.$mpg))) %>%
  unnest(map(quantiles, tidy))
tidy.quants
#> # A tibble: 15 x 3
#> cyl names x
#> <dbl> <chr> <dbl>
#> 1 6 0% 17.80
#> 2 6 25% 18.65
#> 3 6 50% 19.70
#> 4 6 75% 21.00
#> 5 6 100% 21.40
#> 6 4 0% 21.40
#> 7 4 25% 22.80
#> 8 4 50% 26.00
#> 9 4 75% 30.40
#> 10 4 100% 33.90
#> 11 8 0% 10.40
#> 12 8 25% 14.40
#> 13 8 50% 15.20
#> 14 8 75% 16.25
#> 15 8 100% 19.20
This is nice and tidy. However, when trying to spread it (or pass it to a plot), the names column comes back in a (somewhat) unexpected order:
tidy.quants %>% spread(names, x)
#> # A tibble: 3 x 6
#> cyl `0%` `100%` `25%` `50%` `75%`
#> * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 4 21.4 33.9 22.80 26.0 30.40
#> 2 6 17.8 21.4 18.65 19.7 21.00
#> 3 8 10.4 19.2 14.40 15.2 16.25
ggplot(tidy.quants, aes(x = names, y = x, color = factor(cyl))) +
  geom_point()
Is there a clean/idiomatic way to have names come back in the expected order? That is, 0%, 25%, 50%, 75%, 100% instead of 0%, 100%, 25%, 50%, 75%?
Answer 0 (score: 1)
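You can try gtools::mixedsort, which sorts strings containing embedded numbers. Get the sorted labels with mixedsort(unique(names)); then, just as is done for color, turn names (the x-axis variable) into a factor with the sorted values as its levels. ggplot should then display the x-axis labels in the correct order: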
library(gtools)
ggplot(tidy.quants, aes(x = factor(names, levels = mixedsort(unique(names))), y = x, color = factor(cyl))) +
  geom_point() + xlab('names')
The same idea works for spread:
tidy.quants %>%
  mutate(names = factor(names, mixedsort(unique(names)))) %>%
  spread(names, x)
# A tibble: 3 x 6
# cyl `0%` `25%` `50%` `75%` `100%`
#* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 21.4 22.80 26.0 30.40 33.9
#2 6 17.8 18.65 19.7 21.00 21.4
#3 8 10.4 14.40 15.2 16.25 19.2
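If adding gtools is not an option, the same factor-level idea can be sketched in base R. This is not what the answer above uses: it swaps mixedsort for an explicit numeric parse, and it assumes every label is a number followed by a percent sign. The labels vector is a hypothetical stand-in for unique(tidy.quants$names).

```r
# Hypothetical labels, listed in the order a character sort returns them
labels <- c("0%", "100%", "25%", "50%", "75%")

# Strip the percent sign and order the labels by their numeric value
numeric_part <- as.numeric(sub("%", "", labels, fixed = TRUE))
ordered_labels <- labels[order(numeric_part)]

# Using the ordered labels as factor levels fixes the display order
f <- factor(labels, levels = ordered_labels)
levels(f)
```

The resulting levels could then be passed to factor() in the ggplot or spread calls in place of mixedsort(unique(names)).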
Answer 1 (score: 1)
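This works because, within each cyl group, names is already in the order returned by quantile(), so unique(names) yields the levels in order of first appearance: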
tidy.quants <- mtcars %>%
  nest(-cyl) %>%
  mutate(quantiles = map(data, ~ quantile(.$mpg))) %>%
  unnest(map(quantiles, tidy)) %>%
  mutate(names = factor(names, unique(names)))
tidy.quants %>% spread(names, x)
Result
# A tibble: 3 x 6
cyl `0%` `25%` `50%` `75%` `100%`
* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 4 21.4 22.80 26.0 30.40 33.9
2 6 17.8 18.65 19.7 21.00 21.4
3 8 10.4 14.40 15.2 16.25 19.2
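To see why unique(names) is enough here, a small base R sketch may help. The nms vector is a hypothetical stand-in for the names column (quantile labels repeated once per group); it is not taken from the answer itself.

```r
# Hypothetical stand-in for the names column: the five quantile labels,
# repeated for three groups, in the order quantile() emits them
nms <- rep(names(quantile(c(1, 2, 3))), times = 3)

# unique() keeps values in order of first appearance
first_seen <- unique(nms)

# Explicit levels are used as given, so the factor keeps that order
f <- factor(nms, levels = first_seen)
levels(f)

# Without explicit levels, factor() sorts the labels as character strings
levels(factor(nms))
```

Note that this relies on the rows arriving in quantile order; if the data were shuffled before the mutate step, unique(names) would no longer be guaranteed to give the intended order.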