geom_vline median does not match the median of the vector

Time: 2014-09-29 20:25:04

Tags: r ggplot2

I have a vector

var = c(5, 3, 6, 0, 1, 1, 1, 0, 4, 2, 1, 3, 3, 6, 3, 15, 1, 0, 2, 3, 
1, 0, 0, 3, 2, 3, 2, 2, 2, 4, 4, 0, 1, 0, 2, 2, 5, 3, 3, 1, 0, 
1, 1, 6, 4, 3, 0, 7, 4, 2, 3, 3, 0, 1, 1, 3, 4, 5, 2, 1, 3, 10, 
13, 3, 1, 4, 5, 3, 1, 1, 5, 4, 2, 1, 6, 1, 2, 3, 5, 8, 3, 1, 
7, 4, 0, 1, 7, 1, 3, 4, 3, 5, 3, 2, 1, 1, 9, 2, 0, 4, 3, 5)

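I am using ggplot to plot the distribution as a histogram and to draw a vertical line at the median. The median of var is 3 (double-checked with Python's numpy).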

library(data.table)
library(ggplot2)

groupMedian <- median(var)
print(groupMedian)

# Count how often each value occurs, then turn x into a factor with levels 0..25
df <- data.table(x = var)
df <- df[, .N, by = x]
df$x <- factor(df$x, levels = c(0:25))

p <- ggplot(df, aes(x = x, y = N)) +
  geom_bar(stat = "identity", width = 1.0,
           colour = "darkgreen",
           fill = "lightslateblue")

p <- p + labs(title = "Var Histogram",
              x = "x",
              y = "Frequency") +
  scale_x_discrete(drop = FALSE) +
  geom_vline(xintercept = groupMedian,
             colour = "red", size = 2)

p <- p + coord_cartesian(ylim = c(0, 50)) +
  scale_y_continuous(breaks = seq(0, 50, 2))

p <- p + theme(panel.grid.major =
                 element_line(colour = "black", linetype = "dotted"))

ggsave("barplot.png", p, width = 8, height = 4, dpi = 120)

print(p)

The median is 3, but the line is drawn at 2.

I also tried:

p = p+ geom_vline(data=var,
          aes(xintercept = median),
          colour = 'red', size = 2 )

1 answer:

Answer 0 (score: 2)

You can let ggplot2 do the aggregation for you and just use geom_histogram(). This seems to give what you are after:

# load data.table and ggplot2
library(data.table)
library(ggplot2)
df <- data.table(x = var)
groupMedian <- median(var)

ggplot(df, aes(x)) +
  geom_histogram(binwidth = 1,
                 colour = "darkgreen",
                 fill = "lightslateblue",
                 origin = -0.5) + # centers each bin on its integer value; newer ggplot2 releases deprecate `origin` in favour of `boundary`
  geom_vline(xintercept = groupMedian,
             colour = "red",
             size = 2) +
  scale_x_continuous(breaks = seq(0, 25),
                     limits = c(0, 25))

This gives you something like the following:

[Image: the resulting histogram]
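
A likely explanation for the offset in the question: once x is a factor, the axis is discrete and the i-th level sits at integer position i, so a numeric xintercept of 3 falls on the third level, which is labelled "2". The sketch below uses base R only to show that mapping. The names `x_levels`, `label_at_position_3` and `median_pos` are illustrative assumptions, not part of the original post.

```r
# Sketch: how a numeric xintercept lines up with factor levels on a discrete axis.
# `x_levels`, `label_at_position_3` and `median_pos` are illustrative names.
group_median <- 3                 # median(var) as reported in the question
x_levels <- as.character(0:25)    # same levels as factor(df$x, levels = c(0:25))

# A discrete axis places the i-th level at integer position i,
# so position 3 is the third level, i.e. the bar labelled "2".
label_at_position_3 <- x_levels[group_median]

# Position of the level whose label equals the median.
median_pos <- match(as.character(group_median), x_levels)

print(label_at_position_3)
print(median_pos)
```

Under this reading, keeping the factor-based bar chart and passing `median_pos` (here, the median plus one) as `xintercept` should put the line on the bar labelled "3". The geom_histogram() approach above avoids the issue because its x-axis is continuous.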