Pie chart labels display correctly only for some facets

Time: 2018-02-04 02:44:28

Tags: r ggplot2 pie-chart tidyverse

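I'm trying to create a pie chart with multiple facets, where each slice carries its own percentage label. This seems to work for some datasets but not for others. Below is a fully reproducible example in which the label positions are right for one facet (cyl: 4) but wrong for the others (cyl: 6 and cyl: 8). For example, in cyl: 6 the 57% slice is labeled 43% and vice versa.

Does anyone know where this behavior comes from, and how to get rid of it?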

library(dplyr)
library(datasets)
library(ggplot2)
data(mtcars)

# creating a dataframe
df <- dplyr::group_by(mtcars, .dots = c('cyl', 'am')) %>%
  dplyr::summarize(counts = n()) %>%
  dplyr::mutate(perc = (counts / sum(counts)) * 100) %>%
  dplyr::arrange(desc(perc))

# preparing the plot
ggplot2::ggplot(df, aes('', counts)) +
  geom_col(
    position = 'fill',
    color = 'black',
    width = 1,
    aes(fill = factor(am))
  ) +
  facet_wrap(~cyl, labeller = "label_both") +
  geom_label(
    aes(label = paste0(round(perc), "%"), group = factor(am)),
    position = position_fill(vjust = 0.5),
    color = 'black',
    size = 5,
    show.legend = FALSE
  ) +
  coord_polar(theta = "y")

Created on 2018-02-03 by the reprex package (v0.1.1.9000).

2 Answers:

Answer 0: (score: 1)

This is what I get when I run your code:

[image: the answerer's plot output]

Check whether your ggplot2 is up to date.

I'm running ggplot2_2.2.1 and R version 3.4.3.

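Answer 1: (score: 1)

Update: As of the latest ggplot2 code base, to be released as ggplot2 2.3.0, this issue appears to be fixed. I'm leaving my old answer below for archival purposes.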

library(dplyr)
library(datasets)
library(ggplot2)
data(mtcars)

# creating a dataframe
df <- dplyr::group_by(mtcars, .dots = c('cyl', 'am')) %>%
  dplyr::summarize(counts = n()) %>%
  dplyr::mutate(perc = (counts / sum(counts)) * 100) %>%
  dplyr::arrange(desc(perc))

# preparing the plot
ggplot2::ggplot(df, aes('', counts)) +
  geom_col(
    position = 'fill',
    color = 'black',
    width = 1,
    aes(fill = factor(am))
  ) +
  facet_wrap(~cyl, labeller = "label_both") +
  geom_label(
    aes(label = paste0(round(perc), "%"), group = factor(am)),
    position = position_fill(vjust = 0.5),
    color = 'black',
    size = 5,
    show.legend = FALSE
  ) +
  coord_polar(theta = "y")

Created on 2018-05-13 by the reprex package (v0.2.0).

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.0 (2018-04-23)
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Chicago             
#>  date     2018-05-13
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                          
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.5.0)                  
#>  backports    1.1.2      2017-12-13 CRAN (R 3.5.0)                  
#>  base       * 3.5.0      2018-04-24 local                           
#>  bindr        0.1.1      2018-03-13 CRAN (R 3.5.0)                  
#>  bindrcpp   * 0.2        2017-06-17 CRAN (R 3.5.0)                  
#>  colorspace   1.4-0      2017-12-23 R-Forge (R 3.5.0)               
#>  compiler     3.5.0      2018-04-24 local                           
#>  curl         3.1        2017-12-12 CRAN (R 3.5.0)                  
#>  datasets   * 3.5.0      2018-04-24 local                           
#>  devtools     1.13.5     2018-02-18 CRAN (R 3.5.0)                  
#>  digest       0.6.15     2018-01-28 CRAN (R 3.5.0)                  
#>  dplyr      * 0.7.4      2017-09-28 CRAN (R 3.5.0)                  
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.5.0)                  
#>  ggplot2    * 2.2.1.9000 2018-05-12 local                           
#>  glue         1.2.0      2017-10-29 CRAN (R 3.5.0)                  
#>  graphics   * 3.5.0      2018-04-24 local                           
#>  grDevices  * 3.5.0      2018-04-24 local                           
#>  grid         3.5.0      2018-04-24 local                           
#>  gtable       0.2.0      2016-02-26 CRAN (R 3.5.0)                  
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.5.0)                  
#>  httr         1.3.1      2017-08-20 CRAN (R 3.5.0)                  
#>  knitr        1.20       2018-02-20 CRAN (R 3.5.0)                  
#>  labeling     0.3        2014-08-23 CRAN (R 3.5.0)                  
#>  lazyeval     0.2.1      2017-10-29 CRAN (R 3.5.0)                  
#>  magrittr     1.5        2014-11-22 CRAN (R 3.5.0)                  
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.5.0)                  
#>  methods    * 3.5.0      2018-04-24 local                           
#>  mime         0.5        2016-07-07 CRAN (R 3.5.0)                  
#>  munsell      0.4.3      2016-02-13 CRAN (R 3.5.0)                  
#>  pillar       1.2.1      2018-02-27 CRAN (R 3.5.0)                  
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.5.0)                  
#>  plyr         1.8.4      2016-06-08 CRAN (R 3.5.0)                  
#>  R6           2.2.2      2017-06-17 CRAN (R 3.5.0)                  
#>  Rcpp         0.12.16    2018-03-13 CRAN (R 3.5.0)                  
#>  rlang        0.2.0.9001 2018-05-10 Github (r-lib/rlang@ccdbd8b)    
#>  rmarkdown    1.9        2018-03-01 CRAN (R 3.5.0)                  
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.5.0)                  
#>  scales       0.5.0.9000 2018-04-10 Github (hadley/scales@d767915)  
#>  stats      * 3.5.0      2018-04-24 local                           
#>  stringi      1.1.7      2018-03-12 CRAN (R 3.5.0)                  
#>  stringr      1.3.0      2018-02-19 CRAN (R 3.5.0)                  
#>  tibble       1.4.2      2018-01-22 CRAN (R 3.5.0)                  
#>  tools        3.5.0      2018-04-24 local                           
#>  utils      * 3.5.0      2018-04-24 local                           
#>  withr        2.1.2      2018-05-10 Github (jimhester/withr@79d7b0d)
#>  xml2         1.2.0      2018-01-24 CRAN (R 3.5.0)                  
#>  yaml         2.1.18     2018-03-08 CRAN (R 3.5.0)
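
To see on which side of the 2.3.0 release a given installation sits, a version comparison along the following lines may help. This is only a sketch: `older_than_fix()` is a hypothetical helper, not part of ggplot2, and being older than 2.3.0 does not by itself mean the bug will show up (the other answer reports correct output on ggplot2 2.2.1).

```r
# Hypothetical helper (not part of ggplot2): TRUE when `installed`
# is an older version than `fixed_in`.
older_than_fix <- function(installed, fixed_in = "2.3.0") {
  package_version(installed) < package_version(fixed_in)
}

older_than_fix("2.2.1")       # CRAN release mentioned in this thread
older_than_fix("2.2.1.9000")  # development build mentioned in this thread
older_than_fix("2.3.0")       # the release named in the update

# Compare the locally installed version, if ggplot2 is available at all.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  print(older_than_fix(as.character(utils::packageVersion("ggplot2"))))
}
```

`package_version()` compares versions component by component, so the development build 2.2.1.9000 sorts between 2.2.1 and 2.3.0.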

Old answer

It seems that geom_col() in ggplot2_2.2.1.9000 pays attention to the order of the rows in the data frame. This works:

library(dplyr)
library(datasets)
library(ggplot2)
data(mtcars)

# creating a dataframe
df <- dplyr::group_by(mtcars, .dots = c('cyl', 'am')) %>%
  dplyr::summarize(counts = n()) %>%
  dplyr::mutate(perc = (counts / sum(counts)) * 100) %>%
  dplyr::arrange(cyl, desc(am)) # change in the code is here, I'm sorting by cyl and am, not by perc

# preparing the plot
ggplot2::ggplot(df, aes('', counts)) +
  geom_col(
    position = 'fill',
    color = 'black',
    width = 1,
    aes(fill = factor(am))
  ) +
  facet_wrap(~cyl, labeller = "label_both") +
  geom_label(
    aes(label = paste0(round(perc), "%"), group = factor(am)),
    position = position_fill(vjust = 0.5),
    color = 'black',
    size = 5,
    show.legend = FALSE
  ) +
  coord_polar(theta = "y")

[image: resulting plot]

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2       ggplot2_2.2.1.9000 dplyr_0.7.4
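
To make the row-order point in the old answer concrete, the sketch below rebuilds the same counts with base R only and compares the within-facet order of `am` under the two sort orders. It assumes nothing beyond the built-in `mtcars` data; the names `by_perc`, `by_key` and `am_order()` are illustrative and not from the original post, and the sketch only inspects the data rather than reproducing ggplot2's stacking logic.

```r
# Count cars per cyl/am combination (base R, no dplyr needed).
df <- as.data.frame(
  table(cyl = mtcars$cyl, am = mtcars$am),
  responseName = "counts",
  stringsAsFactors = FALSE
)

# Percentage of each am level within its cyl facet.
df$perc <- df$counts / ave(df$counts, df$cyl, FUN = sum) * 100

# Sort order from the question: by descending percentage.
by_perc <- df[order(-df$perc), ]

# Sort order from the old answer: by cyl, then descending am.
by_key <- df[order(df$cyl, -as.numeric(df$am)), ]

# Order in which the am levels appear inside each cyl facet.
am_order <- function(d) sapply(split(d$am, d$cyl), paste, collapse = " > ")

am_order(by_perc)  # cyl 4: "1 > 0", cyl 6: "0 > 1", cyl 8: "0 > 1"
am_order(by_key)   # "1 > 0" in every facet
```

Sorting by `perc` leaves cyl 4 with a different within-facet order than cyl 6 and cyl 8, which lines up with the one facet that was labeled correctly in the question. Sorting by `cyl` and `am` gives every facet the same order.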