ggplot2 histogram binwidth

Time: 2014-10-09 11:25:17

Tags: r ggplot2 histogram

I want to create several histograms in a single plot (using facet_wrap). Here is some example code:

df <- data.frame(p1 = rnorm(100,5,2), p2 = rnorm(100,80,20), group = rep(LETTERS[1:4],25))

library(ggplot2)
library(reshape)

plotData <- melt(df, id.vars = "group", measure.vars = c("p1","p2")  )

m <- ggplot(plotData, aes(x = value, color = group, fill = group, group = group))
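# geom_histogram() bins the continuous values; current ggplot2's geom_bar() only counts them
m <- m + geom_histogram(position = position_dodge())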
m <- m + facet_wrap( ~ variable,scales = "free_x")
print(m)

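Now I would like to change the plot so that the histogram for each parameter ("p1", "p2") has, say, 10 bins.

So far I have not found a way to do this, because the binwidth/breaks calculation would have to depend on the subset of the data shown in each facet.

Is this possible?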


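I would like to share my solution (adapted from the answered question linked above), extended with the option of overlaying the histograms with density curves that are scaled to the histogram counts: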

df <- data.frame(p1 = rnorm(1000,5,2), p2 = rnorm(1000,80,20), group = rep(LETTERS[1:4],25))

library(ggplot2)
library(reshape)
library(plyr)

plotData <- melt(df, id.vars = "group", measure.vars = c("p1","p2")  )

nBins <- 10

groupedData <- dlply(plotData, .(variable))
groupedBinWidth <- llply(groupedData, .fun = function(data, nBins) {
  r <- range(data$value, na.rm = TRUE, finite = TRUE)
  widthOfBins = (r[2] - r[1])/nBins
  if (is.na(widthOfBins) || is.infinite(widthOfBins) || (widthOfBins <= 0)) widthOfBins <- NULL
  widthOfBins
}, nBins = nBins)

densData <- dlply(plotData, .(variable, group), .fun = function(subData){
  param <- subData$variable[1]
  group <- subData$group[1]
  d <- density(subData$value)
  bw <- groupedBinWidth[[param]]
  data.frame(x = d$x, y = d$y * nrow(subData) * bw , group = group, variable = param)
})

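# One histogram layer per parameter, each with its own binwidth.
# geom_histogram() replaces geom_bar(), which no longer accepts binwidth in current ggplot2.
hls <- mapply(function(x, b) geom_histogram(aes(x = value), position = position_dodge(), data = x, binwidth = b), 
              groupedData, groupedBinWidth, SIMPLIFY = FALSE)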

dLay <- mapply(function(data) geom_density(data = data, aes(x = x, y = y), stat = "identity", fill = NA, size = 1), 
               densData)

m <- ggplot(plotData, aes(x = value, color = group, fill = group, group = group))
m <- m + hls
m <- m + dLay
m <- m + facet_wrap( ~ variable,scales = "free")
print(m) 

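The factor `nrow(subData) * bw` used for the density curves can be checked numerically. A kernel density estimate integrates to roughly 1, while the bars of a count histogram cover an area of n times the bin width, so multiplying the density by that product puts both on the same scale. The following is only a minimal base R sketch of that reasoning, not part of the solution above; the names `x`, `n_bins`, `bin_width`, `hist_area` and `curve_area` are made up for illustration.

```r
set.seed(42)                        # assumption: fixed seed, only for reproducibility
x <- rnorm(1000, 5, 2)              # stand-in for one parameter of one group
n_bins <- 10
bin_width <- diff(range(x)) / n_bins

# Count histogram with exactly n_bins equally wide bins
h <- hist(x, breaks = seq(min(x), max(x), length.out = n_bins + 1), plot = FALSE)

# Kernel density estimate, rescaled from "integrates to 1" to the count scale
d <- density(x)
scaled_y <- d$y * length(x) * bin_width

# Area covered by the histogram bars: total count times bin width
hist_area <- sum(h$counts) * bin_width

# Area under the rescaled curve, by the trapezoid rule
curve_area <- sum(diff(d$x) * (head(scaled_y, -1) + tail(scaled_y, -1)) / 2)

cat("histogram area:", hist_area, " scaled density area:", curve_area, "\n")
```

The two areas should agree closely, which is why the rescaled curve sits on top of the bars instead of far below them.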

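1 answer:

Answer 0 (score: 0)

Try this. It is very ugly code, but if I understand you correctly it will work. You may want to use geom_density instead, and perhaps drop the fill to make the plot more readable.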

nbin <- 5
m <- ggplot(plotData, aes(x = value, color = group, fill = group, group = group))
# One histogram layer per parameter; binwidth = (range of that subset) / nbin
m <- m + geom_histogram(data = subset(plotData, variable == "p1"),
                        binwidth = diff(range(subset(plotData, variable == "p1")$value)) / nbin)
m <- m + geom_histogram(data = subset(plotData, variable == "p2"),
                        binwidth = diff(range(subset(plotData, variable == "p2")$value)) / nbin)
m <- m + facet_wrap( ~ variable,scales = "free_x")
print(m)

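The same idea can be written without one hand-typed line per parameter. The sketch below is one possible generalization, not the code from the answer: it splits the long data by `variable` and builds one `geom_histogram()` layer per subset. It passes explicit `breaks` instead of `binwidth`, on the assumption that exactly `n_bins` bins per facet are wanted; with `binwidth` alone ggplot2 chooses the bin alignment itself, so the number of bins may come out one higher. The names `plot_data`, `n_bins` and `hist_layers` are illustrative, and the long format is built with base R here instead of `melt` so the block is self-contained.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only for reproducibility
df <- data.frame(p1 = rnorm(100, 5, 2), p2 = rnorm(100, 80, 20),
                 group = rep(LETTERS[1:4], 25))

# Long format: one row per (group, variable, value)
plot_data <- data.frame(
  group    = rep(df$group, 2),
  variable = factor(rep(c("p1", "p2"), each = nrow(df))),
  value    = c(df$p1, df$p2)
)

n_bins <- 10

# One histogram layer per variable, each with its own equally spaced breaks
hist_layers <- lapply(split(plot_data, plot_data$variable), function(sub) {
  brks <- seq(min(sub$value), max(sub$value), length.out = n_bins + 1)
  geom_histogram(data = sub, breaks = brks, position = "dodge")
})

# Adding a list of layers to a plot adds each layer in turn
m <- ggplot(plot_data, aes(x = value, color = group, fill = group)) +
  hist_layers +
  facet_wrap(~ variable, scales = "free_x")
print(m)
```

Because each layer only receives the rows of its own variable, it is drawn only in the matching facet, and adding a third parameter requires no extra code.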