Question

I have loaded my data and made a quick plot of all variables using the following:

df %>%
  keep(is.numeric) %>% 
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram()

Reference: https://drsimonj.svbtle.com/quick-plot-of-all-variables

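I have split this data frame into two data frames based on a binary variable in one of its columns (Smoker / Non-smoker in my case). I would like to make the same quick plot of all variables, but with the histograms from the two new data frames overlaid in different colours (to see whether they differ significantly).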

I found the following:

Overlaying two ggplot facet_wrap histograms

But it only uses facet_wrap on a single variable. Is there a way to do this by filtering the gathered data frame on the binary value, for example:

df %>%
  keep(is.numeric) %>% 
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram(subset(df,Smoker==1), fill = "Red", alpha=0.3) +
  geom_histogram(subset(df,Smoker==2), fill = "Blue", alpha=0.3)

The idea is to overlay the following two plots:

df_s %>%
  keep(is.numeric) %>% 
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram(fill = "Red", alpha=0.3) 

df_ns %>%
  keep(is.numeric) %>% 
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram(fill = "Blue", alpha=0.3)

I could do it this way, but I would prefer to loop over the df key/value pairs if possible.

Answer 1

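    library(dplyr)    # %>%
    library(purrr)    # keep()
    library(ggplot2)

    df %>%
      keep(is.numeric) %>% # keeps Smoker only if it is numeric (e.g. coded 1/2); remove this step if Smoker is a factor
      tidyr::gather(key, value, -Smoker) %>% # preserve Smoker and use it for colour
      ggplot(aes(value, fill = factor(Smoker))) + # factor() so the groups get a discrete fill scale
      facet_wrap(~ key, scales = "free") +
      geom_histogram(alpha = 0.30, position = "identity") + # "identity" overlays the groups instead of stacking them
      scale_fill_manual(values = c("red", "blue"))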
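
If the data are already split into two data frames, as in the question, one option is to stack them back together with a label column and then plot as above. The following is only a minimal sketch: `df_s` and `df_ns` are filled with made-up toy data here, and the column names `age`, `bmi`, and `group` are assumptions for illustration, not taken from the asker's data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-ins for the asker's df_s and df_ns (hypothetical data)
set.seed(1)
df_s  <- data.frame(age = rnorm(50, 45, 10), bmi = rnorm(50, 27, 4))
df_ns <- data.frame(age = rnorm(50, 40, 10), bmi = rnorm(50, 25, 4))

# Stack the two data frames; .id stores the list names in a "group" column
long <- bind_rows(list(Smoker = df_s, `Non-smoker` = df_ns), .id = "group") %>%
  select(group, where(is.numeric)) %>%   # keep the label plus the numeric columns
  pivot_longer(-group, names_to = "key", values_to = "value")

# One facet per variable, with the two groups overlaid in different colours
p <- ggplot(long, aes(value, fill = group)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram(alpha = 0.3, position = "identity", bins = 30) +
  scale_fill_manual(values = c("Smoker" = "red", "Non-smoker" = "blue"))

p
```

`pivot_longer()` is used here in place of `gather()` because tidyr now recommends it for new code; `gather(key, value, -group)` should give the same long layout.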

Overlaying two quick plots of all variables from several data frames
