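When the vertical axis is frequency or relative frequency, is there a way to overlay something analogous to a density curve? (Not an actual density function, since the area need not integrate to 1.) The following question is similar:
ggplot2: histogram with normal curve, where the user self-answers with the idea of scaling ..count.. inside geom_density(). However, this seems unusual.
The following code produces an overinflated "density" line.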
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(v)) +
stat_bin(aes(y = ..count..), color = "black", fill = "blue",
breaks = b1) +
geom_density(aes(y = ..count..))
hist.1a
Answer 0 (score: 17)
@joran's response/comment got me thinking about what the appropriate scaling factor would be. For posterity's sake, here is the result.
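When the vertical axis is frequency (aka count)
The scaling factor for a vertical axis measured in bin counts is N × (bin width), where N is the number of observations.
In this case, with N = 164 and a bin width of 0.1, the y aesthetic for the smoothed line should be: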
y = ..density..*(164 * 0.1)
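The factor can be motivated by comparing areas. The notation here is introduced only for this sketch: c_j is the count in bin j, h the bin width, N the number of observations, and \hat f the kernel density estimate drawn by geom_density().

```latex
\begin{aligned}
\text{area of the count histogram} &= \sum_j c_j \, h = N h, \\
\int \hat f(x)\,dx = 1 \;&\Longrightarrow\; \int N h \,\hat f(x)\,dx = N h .
\end{aligned}
```

Multiplying the density by N h therefore gives the curve the same area as the bars. With N = 164 and h = 0.1 that area is 16.4.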
Thus, the following code produces a "density" line scaled for a histogram measured in frequency (aka count).
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..count..), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..*(164*0.1)))
hist.1a
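As a rough sanity check on the N × (bin width) factor, the two areas can be compared numerically in base R without plotting. This is only a sketch: the seed, the variable names, and the breaks (chosen here to span the whole sample, unlike b1) are assumptions added for illustration.

```r
set.seed(42)                               # assumption: seed added for reproducibility
v <- rnorm(164, mean = 9, sd = 1.5)
n <- length(v)                             # N = 164
h <- 0.1                                   # bin width

# Breaks that cover every observation, so no value is dropped.
breaks <- seq(floor(min(v)), ceiling(max(v)), by = h)
counts <- hist(v, breaks = breaks, plot = FALSE)$counts

# Area of the count histogram: sum of (count * bin width).
hist_area <- sum(counts * h)

# Area under the kernel density estimate after scaling by n * h,
# approximated with a Riemann sum on density()'s evenly spaced grid.
d <- density(v)
dx <- diff(d$x)[1]
curve_area <- sum(d$y * n * h) * dx

c(hist_area = hist_area, curve_area = curve_area, target = n * h)
```

Both areas should come out close to n * h; the curve's area is only approximate because density() evaluates the estimate on a finite grid.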
When the vertical axis is relative frequency
Using the above, we can write
hist.1b <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..count../164), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..*(0.1)))
hist.1b
When the vertical axis is density
hist.1c <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..density..), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..))
hist.1c
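One possible variation, not from the original answer, avoids hard-coding 164 and 0.1 by computing them from the data. It assumes ggplot2 3.3.0 or later for after_stat(), the newer spelling of the ..density.. notation used above. The seed and the names n, bw, p_count and p_relfreq are illustrative.

```r
library(ggplot2)

set.seed(1)                                # assumption: seed added for reproducibility
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
bw  <- 0.1                                 # bin width
b1  <- seq(4.5, 12, by = bw)
n   <- nrow(df1)                           # number of observations

# Count axis: scale the density by n * bw.
p_count <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = after_stat(count)), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = after_stat(density * n * bw)))

# Relative-frequency axis: divide the counts by n, scale the density by bw.
p_relfreq <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = after_stat(count / n)), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = after_stat(density * bw)))
```

Printing p_count or p_relfreq draws the plot. Observations outside the range of b1 are dropped from the bars, with a warning, but still contribute to the density curve.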
Answer 1 (score: 4)
Try this instead:
ggplot(df1,aes(x = v)) +
geom_histogram(aes(y = ..ncount..)) +
geom_density(aes(y = ..scaled..))
Answer 2 (score: 1)
library(ggplot2)
smoothedHistogram <- function(dat, y, bins = 30, xlabel = y, ...) {
  # Build the histogram first so its total area can be measured.
  gg <- ggplot(dat, aes_string(y)) +
    geom_histogram(bins = bins, center = 0.5, stat = "bin",
                   fill = I("midnightblue"), color = "#E07102", alpha = 0.8)
  gg_build <- ggplot_build(gg)
  # Total bar area: sum of (count * bin width) over all bins.
  area <- sum(with(gg_build[["data"]][[1]], y * (xmax - xmin)))
  # Scale the density curve so it encloses the same area as the bars.
  gg <- gg +
    stat_density(aes(y = ..density.. * area),
                 color = "#BCBD22", size = 2, geom = "line", ...)
  # Reverse the layer order so the histogram is drawn on top of the curve.
  gg$layers <- gg$layers[2:1]
  gg + xlab(xlabel) +
    theme_bw() + theme(axis.title = element_text(size = 16),
                       axis.text = element_text(size = 12))
}
dat <- data.frame(x = rnorm(10000))
smoothedHistogram(dat, "x")