Creating a specific quantile plot in R

Time: 2017-10-08 08:17:54

Tags: r ggplot2 visualization data-visualization

I am very interested in the following visualization (a decile plot):

[image: decile plot from the original post]

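I would like to know how to do this in R.

Of course there are histograms and density plots, but they do not produce such a nice visualization. In particular, I would like to know whether this can be done with ggplot / tidyverse.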

Edit in response to a comment:

library(dplyr)
library(ggplot2)

someData <- tibble(x = rnorm(1000))
ggplot(someData, aes(x = x)) +
  geom_histogram()

This produces a histogram (see http://www.r-fiddle.org/#/fiddle?id=LQXazwMY&version=1).

But how can I get the colorful bars? And how can I get the small rectangles? (The arrows are less relevant.)

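1 answer:

Answer 0: (score: 6)

You have to define a number of histogram breaks and use approximate deciles that coincide with those breaks. Otherwise, two deciles would end up in the same bar.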

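library(ggplot2)

d <- data.frame(x = rnorm(1000))

# 50 equally spaced break points define the histogram bars
breaks <- seq(min(d$x), max(d$x), length.out = 50)
# exact deciles, then each one snapped to the nearest histogram break
quantiles <- quantile(d$x, seq(0, 1, 0.1))
quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])

# label each observation with the midpoint of its bar and of its approximate decile;
# include.lowest = TRUE keeps min(d$x) from becoming NA
d$bar <- as.numeric(as.character(cut(d$x, breaks, na.omit((breaks + dplyr::lag(breaks)) / 2), include.lowest = TRUE)))
d$fill <- cut(d$x, quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2), include.lowest = TRUE)

# one unit-height block per observation, stacked within its bar
ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1])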
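
The reason for snapping the deciles to the breaks can be checked numerically. The following is a minimal sketch in base R, not part of the original answer; the seed, the variable names and the helper `groups_per_bar` are illustrative assumptions. It counts how many color groups fall inside each bar, once with the exact deciles and once with the snapped ones.

```r
# Illustrative check (base R only); seed and names are assumptions for this sketch
set.seed(42)
x <- rnorm(1000)

breaks  <- seq(min(x), max(x), length.out = 50)
deciles <- quantile(x, seq(0, 1, 0.1))
snapped <- sapply(deciles, function(q) breaks[which.min(abs(q - breaks))])

# bar and color group of every observation
bar          <- cut(x, breaks,  include.lowest = TRUE)
fill_exact   <- cut(x, deciles, include.lowest = TRUE)
fill_snapped <- cut(x, snapped, include.lowest = TRUE)

# hypothetical helper: number of distinct color groups inside each bar (NA for empty bars)
groups_per_bar <- function(fill) tapply(fill, bar, function(f) length(unique(f)))

max_exact   <- max(groups_per_bar(fill_exact),   na.rm = TRUE)
max_snapped <- max(groups_per_bar(fill_snapped), na.rm = TRUE)

max_exact    # greater than 1 when some bar straddles a decile
max_snapped  # 1: every bar holds a single color group
```

Because every snapped decile is itself one of the breaks, no bar can cross a color boundary, which is what makes the stacked blocks of one bar share a single color.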


Or with more distinct colors:

ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1]) +
  scale_fill_brewer(type = 'qual', palette = 3) # a qualitative palette with enough colors for the 10 deciles
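
To see which qualitative ColorBrewer palettes can supply the ten colors that deciles need, the palette table shipped with the RColorBrewer package can be filtered. This is a small sketch, assuming RColorBrewer is installed (it normally comes along with ggplot2); the variable names are illustrative.

```r
# brewer.pal.info is a data frame with one row per palette
# (row names are palette names; columns include maxcolors and category)
info <- RColorBrewer::brewer.pal.info

# qualitative palettes offering at least 10 colors
enough <- rownames(info)[info$category == "qual" & info$maxcolors >= 10]
enough
```

Any palette listed there can be passed by name, for example `scale_fill_brewer(palette = enough[1])`.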


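Adding some styling and increasing the number of breaks to 100 (that is, recomputing breaks, bar and fill with length.out = 100):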

ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
  scale_fill_brewer(type = 'qual', palette = 3) +
  theme_classic() +
  coord_fixed(diff(breaks)[1], expand = FALSE) + # makes square blocks
  labs(x = 'x', y = 'count')


Here is the final function:

decile_histogram <- function(data, var, n_breaks = 100) {
  # equally spaced bar edges, and deciles snapped to the nearest edge
  breaks <- seq(min(data[[var]]), max(data[[var]]), length.out = n_breaks)
  quantiles <- quantile(data[[var]], seq(0, 1, 0.1))
  quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])

  # midpoint of each observation's bar; include.lowest = TRUE keeps the minimum from becoming NA
  data$bar <- as.numeric(as.character(
    cut(data[[var]], breaks, na.omit((breaks + dplyr::lag(breaks)) / 2), include.lowest = TRUE)
  ))
  # approximate decile of each observation, used as the fill color
  data$fill <- cut(data[[var]], quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2),
                   include.lowest = TRUE)

  ggplot2::ggplot(data, ggplot2::aes(bar, y = 1, fill = fill)) +
    ggplot2::geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
    ggplot2::scale_fill_brewer(type = 'qual', palette = 3) +
    ggplot2::theme_classic() +
    ggplot2::coord_fixed(diff(breaks)[1], expand = FALSE) + # makes square blocks
    ggplot2::labs(x = var, y = 'count')
}

Used as:

d <- data.frame(x = rnorm(1000))
decile_histogram(d, 'x')
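
One caveat when choosing the number of breaks: `cut()` stops with an error if its break points are not unique, so the approach needs a grid fine enough that no two deciles snap to the same break. The helper below is a hypothetical pre-check, not part of the original answer, and its name and defaults are assumptions.

```r
# Hypothetical helper: TRUE if every decile snaps to its own break point
deciles_snap_uniquely <- function(x, n_breaks = 100) {
  breaks  <- seq(min(x), max(x), length.out = n_breaks)
  deciles <- quantile(x, seq(0, 1, 0.1))
  snapped <- sapply(deciles, function(q) breaks[which.min(abs(q - breaks))])
  anyDuplicated(snapped) == 0
}

set.seed(1)
x <- rnorm(1000)
fine_ok   <- deciles_snap_uniquely(x, n_breaks = 100)
coarse_ok <- deciles_snap_uniquely(x, n_breaks = 5)  # FALSE: 11 deciles cannot occupy 5 distinct breaks

fine_ok
coarse_ok
```

If the check returns FALSE for a given data set, increasing the number of breaks is the simplest remedy.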