Showing the mean and median on two ggplot histograms

Date: 2020-02-10 19:11:38

Tags: r

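I'm a new Stack Overflow user and can't yet comment on the original post to ask my question. I came across an earlier Stack Overflow answer (https://stackoverflow.com/a/34045068/11799491) and would like to know how to add two vertical lines to the plot shown there: one for each group's mean and one for each group's median.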


My attempt: I don't know how to take the grouping variable "type" into account.

geom_vline(aes(xintercept = mean(diff)), color = "black") +
geom_vline(aes(xintercept = median(diff)), color = "red")

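2 answers:

Answer 0: (score: 4)

There are several ways to do this, but I like to build a separate summary data frame and pass it to the geom_vline call. That lets you inspect the computed values, and it makes it easy to add several lines that are automatically ordered and colored by type: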

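library(tidyverse)

# Example data: 40 random values split into two groups of 20
df <-
  tibble(
    x = rnorm(40),
    category = rep(c(0, 1), each = 20)
  )

# One row per (category, statistic) pair, in long format
df_stats <-
  df %>%
  group_by(category) %>%
  summarize(
    mean = mean(x),
    median = median(x)
  ) %>%
  # pivot_longer() is the current replacement for gather()
  pivot_longer(cols = c(mean, median), names_to = "key", values_to = "value")

df %>%
  ggplot(aes(x = x)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ category) +
  # df_stats contains `category`, so each facet gets its own lines
  geom_vline(data = df_stats, aes(xintercept = value, color = key))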

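The same approach can be sketched with the column names from the question (diff and type). This is only an illustration under stated assumptions: the data frame `dat`, its simulated values, and the group labels "A" and "B" are made up here, and the black/red colors are taken from the question's attempt.

```r
# Sketch only: `dat`, its simulated values, and the labels "A"/"B" are assumptions.
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # make the simulated data reproducible
dat <- tibble(
  diff = c(rnorm(50, mean = 0), rnorm(50, mean = 2)),
  type = rep(c("A", "B"), each = 50)
)

# One row per (type, statistic) pair
dat_stats <- dat %>%
  group_by(type) %>%
  summarize(mean = mean(diff), median = median(diff)) %>%
  pivot_longer(cols = c(mean, median), names_to = "stat", values_to = "value")

p <- ggplot(dat, aes(x = diff)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ type) +
  # the layer's own data carries `type`, so the lines are drawn per facet
  geom_vline(data = dat_stats, aes(xintercept = value, color = stat)) +
  # black for the mean and red for the median, as in the question
  scale_color_manual(values = c(mean = "black", median = "red"))

print(p)
```

Because the summary lives in its own data frame, printing `dat_stats` is a quick way to check the values before they are drawn.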

Answer 1: (score: 1)

The simplest way is to precompute the mean and median for each type group. I'll use aggregate:

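# `data` is the data frame from the question, with columns `diff` and `type`
agg <- aggregate(diff ~ type, data, function(x) {
  c(mean = mean(x), median = median(x))
})
# aggregate() stores both statistics in one matrix column; flatten it
agg <- cbind(agg[1], agg[[2]])
# Long format: one row per (type, statistic) pair
agg <- reshape2::melt(agg, id.vars = "type")

library(ggplot2)

ggplot(data, aes(x = diff)) +
  geom_histogram() +
  geom_vline(data = agg, mapping = aes(xintercept = value,
                                       color = variable)) +
  facet_grid(~type) +
  theme_bw()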

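The cbind step in this answer may look odd at first. A small base-R sketch shows why it is needed; the data frame `toy` and its values are invented for illustration and are not part of the original answer.

```r
# Sketch with made-up data: `toy` and its values are assumptions.
toy <- data.frame(
  diff = c(1, 2, 6, 10, 20, 60),
  type = rep(c("a", "b"), each = 3)
)

agg <- aggregate(diff ~ type, toy, function(x) {
  c(mean = mean(x), median = median(x))
})

# The result has only two columns: `type`, and `diff`,
# which is a matrix holding both statistics
ncol(agg)
is.matrix(agg$diff)

# Binding the `type` column to the matrix yields three ordinary columns
flat <- cbind(agg[1], agg[[2]])
names(flat)
```

Once flattened, the data frame can be melted to long format as in the answer, so that each statistic becomes its own row for geom_vline.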