Can I use geom_text or something similar to label bars with group sums, rather than modifying the original data frame?

Time: 2017-06-22 06:19:50

Tags: r ggplot2

Here is a bar chart:

ggplot(filtered_funnel, aes(x = reorder(Funnel, -Sessions), y = Sessions)) +
  geom_bar(stat = "identity", fill = "#008080", alpha = 0.6) +
  xlab("Step") +
  ylab("Events") +
  scale_y_continuous(labels = function(l) {l = l / 1000; paste0(l, "K")}) +
  geom_text(aes(label = Sessions, group = Channel), color = "white")

It looks like this (note the white text labels): [image]

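This happens because the data in filtered_funnel is actually split by the field "Channel". I need the bar labels to be based on the grouped sum of, e.g., Sessions, not on Sessions per channel.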
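
A minimal sketch of why the labels show per-row values, assuming the behaviour comes from how the two layers treat rows: with stat = "identity", geom_bar stacks every row that shares an x value, while geom_text draws one label per row at that row's own y. The data frame `toy` and its numbers are made up for illustration and are not the data from this question.

```r
library(ggplot2)

# Hypothetical toy data: one funnel step split across three channels.
toy <- data.frame(
  Channel  = c("Direct", "Email", "SEM"),
  Funnel   = "Sessions",
  Sessions = c(100, 250, 50)
)

p <- ggplot(toy, aes(x = Funnel, y = Sessions)) +
  # Rows sharing an x value are stacked, so the bar reaches 400.
  geom_bar(stat = "identity") +
  # One label per row, placed at y = 100, 250 and 50.
  geom_text(aes(label = Sessions, group = Channel))

# Inspect what each layer actually draws.
layer_data(p, 1)[, c("ymin", "ymax")]  # stacked segments of the single bar
layer_data(p, 2)[, c("y", "label")]    # three separate labels, none equal to the total
```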

Here is a glimpse of the source data:

> glimpse(filtered_funnel)
Observations: 108
Variables: 4
$ Channel  <chr> "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", ...
$ Promo    <chr> "apples", "apples", "apples", "banannas", "banannas", "banannas", "carrots", "carrots", "carrots", "none", "none", "none...
$ Funnel   <chr> "Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetai...
$ Sessions <dbl> 3993, 6332, 2224, 1237, 1962, 689, 2234, 3543, 1244, 42378, 4672, 28120, 87187, 7408, 2602, 611, 969, 340, 4462, 7280, 2...


> filtered_funnel
Source: local data frame [108 x 4]
Groups: Channel, Promo [?]

   Channel    Promo          Funnel Sessions
     <chr>    <chr>           <chr>    <dbl>
1   Direct   apples        Checkout     3993
2   Direct   apples ShippingDetails     6332
3   Direct   apples    Transactions     2224
4   Direct banannas        Checkout     1237
5   Direct banannas ShippingDetails     1962
6   Direct banannas    Transactions      689
7   Direct  carrots        Checkout     2234
8   Direct  carrots ShippingDetails     3543
9   Direct  carrots    Transactions     1244
10  Direct     none       AddToCart    42378
# ... with 98 more rows

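It looks like ggplot is adding the individual value of each component rather than the sum. For example, taking just the first bar in the image (the Sessions step):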

> filtered_funnel %>% filter(Funnel == "Sessions")
Source: local data frame [6 x 4]
Groups: Channel, Promo [6]

   Channel Promo   Funnel Sessions
     <chr> <chr>    <chr>    <dbl>
1   Direct  none Sessions    87187
2    Email  none Sessions   110035
3 Facebook  none Sessions    79734
4  Organic  none Sessions    80768
5      SEM  none Sessions    94610
6  Youtube  none Sessions    66681

I can see the value 110035 in both the image and the table. What I actually want ggplot to do is label the bar with the sum of Sessions.
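
As a quick check of the number the first bar should carry, the six values from the table above can be summed directly. The vector name `sessions_step` is introduced here only for illustration.

```r
# Sessions values for the "Sessions" step, copied from the table above.
sessions_step <- c(Direct = 87187, Email = 110035, Facebook = 79734,
                   Organic = 80768, SEM = 94610, Youtube = 66681)

total <- sum(sessions_step)
total  # 519015
```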

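Since this is done inside a Shiny app, I am trying to avoid altering the source data, because the data frame is filtered through filter-box inputs. I have also seen other SO answers that seem to hint this might be possible, but I could not get any of them to work.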

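How can I get ggplot to add the sum for each bar? Can ggplot do the grouping and summing itself, rather than me changing the source data that I feed to ggplot in aes?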

---- Data input ---- Following the comments, here is the data (it is randomly generated anyway, so there are no NDA concerns):

> dput(filtered_funnel)
structure(list(Channel = c("Direct", "Direct", "Direct", "Direct", 
"Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", 
"Direct", "Direct", "Direct", "Direct", "Direct", "Direct", "Direct", 
"Email", "Email", "Email", "Email", "Email", "Email", "Email", 
"Email", "Email", "Email", "Email", "Email", "Email", "Email", 
"Email", "Email", "Email", "Email", "Facebook", "Facebook", "Facebook", 
"Facebook", "Facebook", "Facebook", "Facebook", "Facebook", "Facebook", 
"Facebook", "Facebook", "Facebook", "Facebook", "Facebook", "Facebook", 
"Facebook", "Facebook", "Facebook", "Organic", "Organic", "Organic", 
"Organic", "Organic", "Organic", "Organic", "Organic", "Organic", 
"Organic", "Organic", "Organic", "Organic", "Organic", "Organic", 
"Organic", "Organic", "Organic", "SEM", "SEM", "SEM", "SEM", 
"SEM", "SEM", "SEM", "SEM", "SEM", "SEM", "SEM", "SEM", "SEM", 
"SEM", "SEM", "SEM", "SEM", "SEM", "Youtube", "Youtube", "Youtube", 
"Youtube", "Youtube", "Youtube", "Youtube", "Youtube", "Youtube", 
"Youtube", "Youtube", "Youtube", "Youtube", "Youtube", "Youtube", 
"Youtube", "Youtube", "Youtube"), Promo = c("apples", "apples", 
"apples", "banannas", "banannas", "banannas", "carrots", "carrots", 
"carrots", "none", "none", "none", "none", "none", "none", "pears", 
"pears", "pears", "apples", "apples", "apples", "banannas", "banannas", 
"banannas", "carrots", "carrots", "carrots", "none", "none", 
"none", "none", "none", "none", "pears", "pears", "pears", "apples", 
"apples", "apples", "banannas", "banannas", "banannas", "carrots", 
"carrots", "carrots", "none", "none", "none", "none", "none", 
"none", "pears", "pears", "pears", "apples", "apples", "apples", 
"banannas", "banannas", "banannas", "carrots", "carrots", "carrots", 
"none", "none", "none", "none", "none", "none", "pears", "pears", 
"pears", "apples", "apples", "apples", "banannas", "banannas", 
"banannas", "carrots", "carrots", "carrots", "none", "none", 
"none", "none", "none", "none", "pears", "pears", "pears", "apples", 
"apples", "apples", "banannas", "banannas", "banannas", "carrots", 
"carrots", "carrots", "none", "none", "none", "none", "none", 
"none", "pears", "pears", "pears"), Funnel = c("Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "Checkout", "ShippingDetails", 
"Transactions", "Checkout", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions", "AddToCart", "Checkout", 
"Registrations", "Sessions", "ShippingDetails", "Transactions", 
"Checkout", "ShippingDetails", "Transactions"), Sessions = c(3993, 
6332, 2224, 1237, 1962, 689, 2234, 3543, 1244, 42378, 4672, 28120, 
87187, 7408, 2602, 611, 969, 340, 4462, 7280, 2304, 549, 896, 
283, 2094, 3417, 1081, 42251, 5666, 29094, 110035, 9244, 2926, 
256, 418, 132, 129, 191, 85, 3078, 4557, 2039, 120, 178, 79, 
13977, 90, 9727, 79734, 134, 59, 1142, 1691, 756, 3125, 4655, 
1985, 1724, 2568, 1095, 3109, 4631, 1975, 34756, 2864, 23453, 
80768, 4266, 1819, 249, 371, 158, 1839, 2661, 1223, 1543, 2232, 
1026, 2007, 2904, 1335, 24090, 1792, 15272, 94610, 2593, 1192, 
479, 693, 318, 800, 1245, 522, 1734, 2698, 1132, 930, 1447, 607, 
22349, 1436, 14478, 66681, 2235, 937, 1579, 2457, 1031)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -108L), drop = TRUE, .Names = c("Channel", "Promo", 
"Funnel", "Sessions"), indices = list(0:2, 3:5, 6:8, 9:14, 15:17, 
    18:20, 21:23, 24:26, 27:32, 33:35, 36:38, 39:41, 42:44, 45:50, 
    51:53, 54:56, 57:59, 60:62, 63:68, 69:71, 72:74, 75:77, 78:80, 
    81:86, 87:89, 90:92, 93:95, 96:98, 99:104, 105:107), group_sizes = c(3L, 
3L, 3L, 6L, 3L, 3L, 3L, 3L, 6L, 3L, 3L, 3L, 3L, 6L, 3L, 3L, 3L, 
3L, 6L, 3L, 3L, 3L, 3L, 6L, 3L, 3L, 3L, 3L, 6L, 3L), biggest_group_size = 6L, labels = structure(list(
    Channel = c("Direct", "Direct", "Direct", "Direct", "Direct", 
    "Email", "Email", "Email", "Email", "Email", "Facebook", 
    "Facebook", "Facebook", "Facebook", "Facebook", "Organic", 
    "Organic", "Organic", "Organic", "Organic", "SEM", "SEM", 
    "SEM", "SEM", "SEM", "Youtube", "Youtube", "Youtube", "Youtube", 
    "Youtube"), Promo = c("apples", "banannas", "carrots", "none", 
    "pears", "apples", "banannas", "carrots", "none", "pears", 
    "apples", "banannas", "carrots", "none", "pears", "apples", 
    "banannas", "carrots", "none", "pears", "apples", "banannas", 
    "carrots", "none", "pears", "apples", "banannas", "carrots", 
    "none", "pears")), class = "data.frame", row.names = c(NA, 
-30L), drop = TRUE, .Names = c("Channel", 
"Promo")))

1 Answer:

Answer 0 (score: 4)

You can use stat_summary to compute the sum and use it for both the y position and the label:

ggplot(filtered_funnel, aes(x = reorder(Funnel, -Sessions), y = Sessions)) +
  geom_bar(stat = "identity", fill = "#008080", alpha = 0.6) +
  xlab("Step") +
  ylab("Events") +
  scale_y_continuous(labels = function(l) {l = l / 1000; paste0(l, "K")})  +
  stat_summary(aes(label = ..y..), fun.y = 'sum', geom = 'text', col = 'white', vjust = 1.5)
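
Newer ggplot2 releases (3.3.0 onward) favour `fun` and `after_stat()` over `fun.y` and `..y..`. A sketch of the same idea in the newer syntax, assuming ggplot2 >= 3.3.0 is installed; it uses only a small subset of the rows from the question, and the name `funnel_subset` is made up for this example.

```r
library(ggplot2)

# Subset of the question's rows: two funnel steps, six channels each (Promo "none").
funnel_subset <- data.frame(
  Channel  = rep(c("Direct", "Email", "Facebook", "Organic", "SEM", "Youtube"), times = 2),
  Funnel   = rep(c("Sessions", "AddToCart"), each = 6),
  Sessions = c(87187, 110035, 79734, 80768, 94610, 66681,
               42378, 42251, 13977, 34756, 24090, 22349)
)

p <- ggplot(funnel_subset, aes(x = reorder(Funnel, -Sessions), y = Sessions)) +
  geom_bar(stat = "identity", fill = "#008080", alpha = 0.6) +
  # Sum Sessions per x value; the computed sum drives both the position and the label.
  stat_summary(aes(label = after_stat(y)), fun = sum, geom = "text",
               colour = "white", vjust = 1.5)

# One row per bar, holding the per-step total.
layer_data(p, 2)[, c("x", "y", "label")]
```

Passing the function object `sum` rather than the string 'sum' is a stylistic choice here; the summary itself is the same.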

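
An alternative sketch, not taken from the answer above: a layer's `data` argument also accepts a function, which ggplot2 calls with the plot's data. The totals can therefore be computed for the text layer only, while the data frame passed to ggplot() stays untouched. The rows are a small subset of the question's data; `funnel_subset` and `step_totals` are made-up names, and this has not been tried on the grouped tibble from the question.

```r
library(ggplot2)

# Subset of the question's rows: two funnel steps, six channels each (Promo "none").
funnel_subset <- data.frame(
  Channel  = rep(c("Direct", "Email", "Facebook", "Organic", "SEM", "Youtube"), times = 2),
  Funnel   = rep(c("Sessions", "AddToCart"), each = 6),
  Sessions = c(87187, 110035, 79734, 80768, 94610, 66681,
               42378, 42251, 13977, 34756, 24090, 22349)
)

# Hypothetical helper: per-step totals, computed on whatever data the plot receives.
step_totals <- function(d) aggregate(Sessions ~ Funnel, data = d, FUN = sum)

p <- ggplot(funnel_subset, aes(x = Funnel, y = Sessions)) +
  geom_bar(stat = "identity", fill = "#008080", alpha = 0.6) +
  # Only this layer sees the aggregated rows; the plot data itself is unchanged.
  geom_text(data = step_totals, aes(label = Sessions), colour = "white", vjust = 1.5)

# One label per funnel step.
layer_data(p, 2)[, c("x", "y", "label")]
```

In a Shiny app this keeps the filtered data frame as the single input, since the aggregation runs again each time the plot is rebuilt.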