Mean of each group in ggplot

Time: 2019-06-12 01:52:13

Tags: r ggplot2

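I'm trying to create a dot plot that is discrete along one axis and continuous along the other. I then want to show the mean for each discrete value.

This is the closest I've gotten so far: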

library(tibble)
library(dplyr)
library(stringr)
library(ggplot2)

mtcars_with_brand <- mtcars %>%
  ungroup() %>%
  rownames_to_column("Car") %>%
  mutate(Brand = word(Car, 1,1, sep = " ")) %>%
  mutate(Brand = ifelse(Brand %in% c('Fiat','Toyota','Hornet', 'Merc'), Brand, 'zOther')) %>%
  mutate(Brand=reorder(Brand, mpg, mean))

mean_mpg <- mtcars_with_brand %>%
  group_by(Brand) %>%
  mutate(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  ungroup() %>%
  select(Brand, mean_mpg) %>%
  distinct()

mtcars_with_brand %>%
  ggplot(aes(x = Brand, y = mpg)) +
  geom_col(data = mean_mpg, 
           aes(x = Brand,
               y = mean_mpg),
           col = "black",
           fill = "white") +
  geom_point() +  # geom_point() has no height argument; it was ignored with a warning
  geom_vline(xintercept=seq(from=0.5, to=5.5, by=1), colour='#bbbbbb') +
  coord_flip() +
  theme_classic()


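But what I'd really like is a single line at the mean value, rather than the full outline that the bar gives me.

It feels like I'm using the wrong geom, but I'm not sure which one I should use instead. I've looked at geom_linerange and similar geoms, but if they are suited to this, I can't see how to apply them.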

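1 answer:

Answer 0: (score: 1)

You can use geom_point for both layers: once for the individual cars, and once for the per-brand means computed with group_by and summarise. You seem to have a good handle on the styling elements, so I've left them out to keep the solution clear: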

ggplot() +
  # Points for each car
  geom_point(data = mtcars_with_brand, mapping = aes(y = Brand, x = mpg)) +
  # Vertical bars for the means
  geom_point(data = mtcars_with_brand %>% 
      # Group the data by brand then get means
      group_by(Brand) %>% 
      summarise(mean_mpg = mean(mpg)), 
    # Specify aesthetics
    mapping = aes(y = Brand, x = mean_mpg), 
    size = 10, color = 'red', shape = '|') 

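Since the question mentions geom_linerange and similar geoms, here is a minimal sketch of one possible alternative, not part of the original answer: a zero-length geom_errorbar, where ymin and ymax are both set to the group mean so that only the end cap is drawn, as a short line across each group. To keep it self-contained it uses a hypothetical Group column built from cyl in place of Brand, and base R's aggregate instead of dplyr. The width and colour values are arbitrary choices.

```r
library(ggplot2)

# Hypothetical stand-in for the Brand column: group mtcars by cylinder count.
cars <- transform(mtcars, Group = factor(cyl))

# One row per group, holding the mean mpg (the column keeps the name "mpg").
means <- aggregate(mpg ~ Group, data = cars, FUN = mean)

p <- ggplot() +
  # One point per car
  geom_point(data = cars, aes(x = Group, y = mpg)) +
  # Zero-length error bar: ymin == ymax, so only the cap shows, as a line at the mean
  geom_errorbar(data = means,
                aes(x = Group, ymin = mpg, ymax = mpg),
                width = 0.6, colour = "red") +
  # Same orientation as the question: groups on the vertical axis
  coord_flip()

print(p)
```

The width argument controls how far the line extends across each group's band, so it can be tuned independently of point size, which is the main difference from the shape = '|' approach.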