"密度"直方图上的曲线叠加,其中垂直轴是频率(即计数)还是相对频率?

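Asked: 2014-12-22 22:32:05

Tags: r ggplot2

Is there a way to overlay something analogous to a density curve when the vertical axis is frequency or relative frequency? (Not an actual density function, since the area does not need to integrate to 1.) The following question is similar: "ggplot2: histogram with normal curve", where the user self-answers with the idea of scaling ..count.. inside geom_density(). However, this seems unusual.

The following code produces an overinflated "density" line.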

library(ggplot2)

df1            <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1             <- seq(4.5, 12, by = 0.1)
hist.1a        <- ggplot(df1, aes(v)) + 
                    stat_bin(aes(y = ..count..), color = "black", fill = "blue",
                             breaks = b1) + 
                    geom_density(aes(y = ..count..))
hist.1a


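3 Answers:

Answer 0 (score: 17)

@joran's response/comment got me thinking about what the appropriate scaling factor would be. For posterity's sake, here is the result.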

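When Vertical Axis is Frequency (aka Count)

$$\text{density} = \frac{\text{bin count}}{N \times \text{bin width}}$$

Thus, the scaling factor for a vertical axis measured in bin counts is N × bin width:

$$\text{bin count} = \text{density} \times (N \times \text{bin width})$$

In this case, with N = 164 and a bin width of 0.1, the aesthetic for y in the smoothed line should be:

y = ..density..*(164 * 0.1)

Thus, the following code produces a "density" line scaled for a histogram measured in frequency (aka count).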

df1            <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1             <- seq(4.5, 12, by = 0.1)
hist.1a        <- ggplot(df1, aes(x = v)) + 
                    geom_histogram(aes(y = ..count..), breaks = b1, 
                                   fill = "blue", color = "black") + 
                    geom_density(aes(y = ..density..*(164*0.1)))
hist.1a
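The hard-coded `164 * 0.1` can also be derived from the data and the breaks. The following is a minimal sketch under some assumptions: the names `n`, `binwidth` and `scale_factor` are illustrative and not part of the original answer, `set.seed()` is added only for reproducibility, and `after_stat(density)` is the spelling of `..density..` used by ggplot2 3.3.0 and later.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1  <- seq(4.5, 12, by = 0.1)

n            <- nrow(df1)       # number of observations (164)
binwidth     <- mean(diff(b1))  # common width of the bins (0.1)
scale_factor <- n * binwidth    # converts density units to count units

p <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = after_stat(count)), breaks = b1,
                 fill = "blue", color = "black") +
  # Multiply the kernel density by N * bin width to put it on the count axis
  geom_density(aes(y = after_stat(density) * scale_factor))
p
```

Computing the factor this way keeps the curve on the count scale if the sample size or the bin width changes.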


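When Vertical Axis is Relative Frequency

$$\text{relative frequency} = \frac{\text{bin count}}{N} = \text{density} \times \text{bin width}$$

Using the above, we could write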

hist.1b        <- ggplot(df1, aes(x = v)) + 
                    geom_histogram(aes(y = ..count../164), breaks = b1, 
                                   fill = "blue", color = "black") + 
                    geom_density(aes(y = ..density..*(0.1)))
hist.1b
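The relationships between count, relative frequency and density can be checked numerically with base R's `hist()`, independently of ggplot2. This is only a sketch: the variable names and the seed are assumptions, and the breaks are built from the data range because `hist()` requires the breaks to span all values.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
v  <- rnorm(164, mean = 9, sd = 1.5)
n  <- length(v)
bw <- 0.1

# Breaks with a common width that cover the whole data range
b <- seq(floor(min(v)), ceiling(max(v)), by = bw)
h <- hist(v, breaks = b, plot = FALSE)

# count = density * N * bin width
count_ok <- isTRUE(all.equal(h$counts, h$density * n * bw))

# relative frequency = count / N = density * bin width
relfreq    <- h$counts / n
relfreq_ok <- isTRUE(all.equal(relfreq, h$density * bw))

c(count_ok = count_ok, relfreq_ok = relfreq_ok, sums_to_one = isTRUE(all.equal(sum(relfreq), 1)))
```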


When Vertical Axis is Density

hist.1c        <- ggplot(df1, aes(x = v)) + 
                    geom_histogram(aes(y = ..density..), breaks = b1, 
                                   fill = "blue", color = "black") + 
                    geom_density(aes(y = ..density..))
hist.1c


Answer 1 (score: 4)

Try this instead:

ggplot(df1,aes(x = v)) + 
   geom_histogram(aes(y = ..ncount..)) + 
   geom_density(aes(y = ..scaled..))
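Note that `..ncount..` and `..scaled..` both rescale their layer so that its maximum is 1, so with this approach the vertical axis shows neither counts nor relative frequencies. A small sketch that confirms this with `layer_data()`; the seed, the `bins = 30` setting and the variable names are assumptions, and `after_stat()` is the ggplot2 3.3.0+ spelling of the `..name..` notation.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))

p <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = after_stat(ncount)), bins = 30) +
  geom_density(aes(y = after_stat(scaled)))

bars  <- layer_data(p, 1)  # computed histogram bars
curve <- layer_data(p, 2)  # computed density curve

# Both layers peak at 1, so the tallest bar and the curve's peak line up
c(max_bar = max(bars$y), max_curve = max(curve$y))
```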

Answer 2 (score: 1)

library(ggplot2)
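# Draw a histogram of column `y` of `dat` with a density line on the count scale.
smoothedHistogram <- function(dat, y, bins=30, xlabel = y, ...){
  gg <- ggplot(dat, aes_string(y)) + 
    geom_histogram(bins=bins, center = 0.5, stat="bin", 
                   fill = I("midnightblue"), color = "#E07102", alpha=0.8) 
  # Build the plot to get the computed bars, then sum height * width over all bins.
  gg_build <- ggplot_build(gg)
  area <- sum(with(gg_build[["data"]][[1]], y*(xmax - xmin)))
  # Multiply the density by the histogram's total area to put it on the count scale.
  gg <- gg + 
    stat_density(aes(y=..density..*area), 
                 color="#BCBD22", size=2, geom="line", ...)
  # Reverse the layer order so the bars are drawn on top of the density line.
  gg$layers <- gg$layers[2:1]
  gg + xlab(xlabel) +  
    theme_bw() + theme(axis.title = element_text(size = 16),
                       axis.text = element_text(size = 12))
}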

dat <- data.frame(x = rnorm(10000))
smoothedHistogram(dat, "x")
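The `area` computed inside the function is the same N × bin width factor, obtained from the built plot instead of being entered by hand. The sketch below checks this on a plain histogram; the seed and the variable names are assumptions, and it relies on all bins having the same width, which holds for the `bins = 30` setting used here.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
dat <- data.frame(x = rnorm(10000))

gg   <- ggplot(dat, aes(x)) + geom_histogram(bins = 30)
bars <- ggplot_build(gg)$data[[1]]

# Total area of the bars: sum of count * bin width
area <- sum(bars$y * (bars$xmax - bars$xmin))

# With equal-width bins this equals N * bin width
binwidth <- bars$xmax[1] - bars$xmin[1]
c(area = area, n_times_binwidth = nrow(dat) * binwidth)
```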
