How is binning done in ggplot2's stat_summary_bin?

Time: 2018-09-09 19:05:00

Tags: r ggplot2

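I am trying to add some custom functionality to binned scatter plots with ggplot2. My original approach to the binned scatter was stat_summary_bin(fun.y = "mean"). This seems to produce sensible bins, but when I try to reproduce it by binning manually I get slightly different results, especially in the right tail.

Can anyone help me figure out how the binning in stat_summary_bin is done? I need to work out whether this is a reliable form of binned scatter that I can use...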

library(tidyverse)
library(mltools)
#> 
#> Attaching package: 'mltools'
#> The following object is masked from 'package:tidyr':
#> 
#>     replace_na
set.seed(42)  # seed added so the simulated data are reproducible
x <- runif(1000, 0, 10)
y <- x + rnorm(1000, 0.5, 2)
plot(x,y)

df <- data.frame(x = x, y = y)

p <- df %>%
  ggplot(aes(x = x, y = y)) +
  stat_summary_bin(aes(color ="stat summary"),fun.y = "mean", size = 2.5, geom="point", bins=20)
p

## Attempt 1 at binning
df$x_bin <-  mltools::bin_data(df$x, bins=20, binType = "explicit")
df_binned <- df %>%
  group_by(x_bin) %>%
  mutate(
    x_binned = mean(x),
    y_binned = mean(y)
  ) %>%
  ungroup()

# Pass the binned data explicitly instead of using `$` inside aes()
p <- p + geom_point(data = df_binned, aes(x = x_binned, y = y_binned, color = "manual bin"), size = 2.5)
p

## Attempt 2 at binning
# Quantile breaks: 20 bins holding roughly equal numbers of points
xbreaks <- quantile(df$x, probs = seq(0, 1, 0.05))
df_binned$x_bin_2 <- cut(df$x, xbreaks, include.lowest = TRUE)
df_binned <- df_binned %>%
  group_by(x_bin_2) %>%
  mutate(
    x_binned2 = mean(x),
    y_binned2 = mean(y)
  ) %>%
  ungroup()

# Pass the binned data explicitly instead of using `$` inside aes()
p <- p + geom_point(data = df_binned, aes(x = x_binned2, y = y_binned2, color = "2nd manual bin"), size = 2.5)
p

Created on 2018-09-09 by the reprex package (v0.2.0).
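For comparison, here is a minimal base-R sketch of a third kind of manual binning: equal-width bins over the range of x, with each bin mean placed at the bin midpoint rather than at the mean of x within the bin. Whether stat_summary_bin chooses its breaks this way is only an assumption here, not something confirmed above. The names `n_bins`, `breaks`, `mids` and `equal_width` are made up for illustration.

```r
# Equal-width binning sketch (an assumption for comparison,
# not taken from the ggplot2 source).
set.seed(1)
x <- runif(1000, 0, 10)
y <- x + rnorm(1000, 0.5, 2)

n_bins <- 20

# n_bins + 1 equally spaced edges spanning the observed range of x
breaks <- seq(min(x), max(x), length.out = n_bins + 1)

# Midpoint of each bin: average of its left and right edge
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2

# Integer bin index per observation; include.lowest keeps min(x) in bin 1
bin_id <- cut(x, breaks, include.lowest = TRUE, labels = FALSE)
bin_f  <- factor(bin_id, levels = seq_len(n_bins))

equal_width <- data.frame(
  x_mid  = mids,                               # where the point would be drawn
  x_mean = as.numeric(tapply(x, bin_f, mean)), # mean of x inside the bin
  y_mean = as.numeric(tapply(y, bin_f, mean)), # mean of y inside the bin
  n      = as.integer(table(bin_f))            # observations per bin
)

print(head(equal_width))
```

Comparing `x_mid` with `x_mean` shows how much of a horizontal shift comes purely from where each bin's point is placed. The quantile bins of attempt 2 differ more fundamentally, because their widths vary with the density of x.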

0 answers:

No answers