scatterpie: how to add annotations on the lines connecting pies to mark the percent change in y value between pies

Time: 2020-07-24 11:30:56

Tags: r ggplot2 scatterpie

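I have a scatterpie plot, with pies drawn at x and y positions and connected by a "trend line". In the spirit of this answer, I would like to add an annotation on each line segment that marks the percent increase/decrease between the y values of each pair of adjacent pies.

My data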

library(tidyverse)
library(scatterpie)

my_df <- structure(list(day_in_july = 13:20, yes_and_yes = c(0.611814345991561, 
0.574750830564784, 0.593323216995448, 0.610539845758355, 0.650602409638554, 
0.57429718875502, 0.575971731448763, 0.545454545454545), yes_but_no = c(0.388185654008439, 
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446, 
0.42570281124498, 0.424028268551237, 0.454545454545455), y = c(0.388185654008439, 
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446, 
0.42570281124498, 0.424028268551237, 0.454545454545455)), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))

My current visualization

p <- ggplot(data = my_df) +
  geom_path(aes(x=day_in_july, y = y*50)) +
  geom_scatterpie(aes(x = day_in_july, y = y*50, r = 0.3), 
                  data = my_df, 
                  cols = colnames(my_df)[2:3],
                  color = "red") + 
  geom_text(aes(y = y*50, x = day_in_july, 
                label = paste0(formatC(y*100, digits = 3), "%")),
            nudge_y = 0.07, nudge_x = -0.25, size = 3) +
  geom_text(aes(y = y*50, x = day_in_july, 
            label = paste0(formatC((1-y)*100, digits = 3), "%")),
            nudge_y = -0.07, nudge_x = 0.25, size = 3) +
  scale_fill_manual(values = c("pink", "seagreen3")) +
  scale_x_continuous(labels = my_df$day_in_july, breaks = my_df$day_in_july) +
  scale_y_continuous(name = "yes but no",
                     labels = function(x) x/50) + 
  coord_fixed()

> p

[Image: scatterpie plot]

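I want to add the percent increase/decrease between the y values of adjacent pies

The y value of the first pie (at day_in_july = 13) is 0.388. Going from this y value to the y value of the next pie (0.425) is an increase of about 9.55%. I would therefore like to label the line connecting the two pies with +9.55%.

Ultimately, I would like the plot to look like this: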
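The arithmetic can be checked directly against the data. This is a minimal sketch in base R that uses only the `y` column of `my_df`; the name `pct_change` is introduced here for illustration and is not part of the original post.

```r
# y values of the eight pies, copied from my_df$y
y <- c(0.388185654008439, 0.425249169435216, 0.406676783004552,
       0.389460154241645, 0.349397590361446, 0.42570281124498,
       0.424028268551237, 0.454545454545455)

# Percent change from each pie to the next: (next / current - 1) * 100.
# y[-1] drops the first value, y[-length(y)] drops the last one,
# so the two vectors line up as (next, current) pairs.
pct_change <- (y[-1] / y[-length(y)] - 1) * 100

# One value per line segment (7 segments for 8 pies)
round(pct_change, 2)
```

The first element belongs to the segment between day 13 and day 14, the second to the segment between day 14 and day 15, and so on.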

[Image: scatterpie plot with annotated lines]

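On the way to a solution

This answer already has the relevant mechanics for what I want. The idea is to use ggplot_build() to access the data underlying the plot, compute the percent change between each two consecutive values, and then rebuild the plot with the lines annotated accordingly. However, that solution does not carry over to the scatterpie plot, because the underlying data returned by ggplot_build() is of a different kind.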

plot_data <- ggplot_build(p)$data[[2]] %>% as_tibble()  # layer 2 is geom_scatterpie; layer 1 is geom_path

> plot_data

## # A tibble: 2,904 x 13
##    fill  group   index amount PANEL stringsAsFactors nControl     x     y colour  size linetype alpha
##    <chr> <chr>   <dbl>  <dbl> <fct> <lgl>               <dbl> <dbl> <dbl> <chr>  <dbl>    <dbl> <lgl>
##  1 pink  1     0        0.612 1     FALSE                 221  13    19.7 red      0.5        1 NA   
##  2 pink  1     0.00452  0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  3 pink  1     0.00905  0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  4 pink  1     0.0136   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  5 pink  1     0.0181   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  6 pink  1     0.0226   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  7 pink  1     0.0271   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  8 pink  1     0.0317   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
##  9 pink  1     0.0362   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
## 10 pink  1     0.0407   0.612 1     FALSE                 221  13.0  19.7 red      0.5        1 NA   
## # ... with 2,894 more rows

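Where are the actual y values needed to compute the percent change between the pies' y values? Obviously, I could take the y values from my data. But for rebuilding the plot, the data from ggplot_build() does not make sense to me, and I do not know how to use that technique to add the percent change between pies onto the lines of the plot.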
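One observation that may help: `ggplot_build()` returns one data frame per layer, in the order the layers were added. The arc rows shown above come from the pie layer, while a `geom_path()` layer keeps one row per point with the plotted y value. The following is a minimal sketch under the assumption that only ggplot2 is available; the pies are replaced by `geom_point()`, and `my_line`, `p_line`, `path_data`, and `pct_change` are names made up for this illustration.

```r
library(ggplot2)

# Stripped-down copy of the data: one row per pie
my_line <- data.frame(
  day_in_july = 13:20,
  y = c(0.388185654008439, 0.425249169435216, 0.406676783004552,
        0.389460154241645, 0.349397590361446, 0.42570281124498,
        0.424028268551237, 0.454545454545455)
)

# Layer 1 is the path, layer 2 stands in for the pies
p_line <- ggplot(my_line) +
  geom_path(aes(x = day_in_july, y = y * 50)) +
  geom_point(aes(x = day_in_july, y = y * 50))

# Built data of layer 1 (geom_path): one row per point, with x and y
path_data <- ggplot_build(p_line)$data[[1]]

# Percent change between consecutive points; the factor 50 cancels out
n <- nrow(path_data)
pct_change <- (path_data$y[-1] / path_data$y[-n] - 1) * 100
```

Since the path layer simply echoes the (scaled) input values, computing the changes from `my_df` directly gives the same numbers without going through the built plot.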

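1 Answer:

Answer 0: (score: 2)

Here is my attempt using the ggrepel package. I basically created a new data frame, foo, that holds the information geom_label_repel() needs. I have left out a detailed explanation of how foo is built, but I think you can read it from the code. I spent some time looking for a good position for the labels, and this is what I can offer for now. If you are not happy with the placement, you will have to play around with it yourself.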

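library(ggrepel)

# One row per line segment: label position (segment midpoint) and percent change in y
foo <- tibble(day_in_july = my_df$day_in_july + 0.5,
              y = my_df$y * 50 + (((lead(my_df$y * 50) - (my_df$y * 50))) / 2),
              gap = ((lead(my_df$yes_but_no) / my_df$yes_but_no) - 1) * 100) %>% 
       filter(!is.na(gap)) %>%  # the last pie has no following pie
       mutate(hue = ifelse(gap > 0, "green", "red"),  # pick the colour while gap is still numeric
              gap = paste(round(gap, digits = 2), "%", sep = ""))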


p <- ggplot(data = my_df) +
     geom_path(aes(x = day_in_july, y = y*50)) +
     geom_scatterpie(aes(x = day_in_july, y = y*50, r = 0.3), 
                     data = my_df, 
                     cols = colnames(my_df)[2:3],
                     color = "red") + 
     geom_text(aes(y = y * 50, x = day_in_july, 
               label = paste0(formatC(y * 100, digits = 3), "%")),
               nudge_y = 0.07, nudge_x = -0.25, size = 3) +
     geom_text(aes(y = y * 50, x = day_in_july, 
               label = paste0(formatC((1-y) * 100, digits = 3), "%")),
               nudge_y = -0.07, nudge_x = 0.25, size = 3) +
     scale_fill_manual(values = c("pink", "seagreen3")) +
     geom_label_repel(data = foo, 
                      aes(x = day_in_july, y = y, 
                      color = hue, label = as.character(gap)),
                      show.legend = FALSE,
                      nudge_x = 0.3,
                      direction = "y",
                      vjust = -1.0) +
     scale_color_manual(values = c(green = "green", red = "red"))
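The question asks for labels with an explicit sign, such as `+9.55%`. One possible way to get that, sketched here on the assumption that the percent change is still available as a numeric vector, is `sprintf()` with the `+` flag. The names `gap_num`, `gap_label`, and `hue` below are hypothetical, and the input values are illustrative numbers rather than output of the code above.

```r
# Illustrative percent changes (numeric, not yet formatted)
gap_num <- c(9.5479, -4.3674, 0)

# "%+.2f" always prints the sign and two decimals; "%%" is a literal percent sign
gap_label <- sprintf("%+.2f%%", gap_num)

# Decide the colour from the numeric value, before any string formatting
hue <- ifelse(gap_num > 0, "green", "red")

gap_label
```

A vector built this way could stand in for the `paste(round(gap, digits = 2), "%", sep = "")` step when constructing `foo`.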
