Plotting data from different columns on the same ggplot

Time: 2018-10-14 02:52:35

Tags: r ggplot2

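I want to display data so that a point uses an "open" shape when the value in a certain column is greater than 0.05 and a "closed" shape when the value is less than 0.05.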

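My instinct was to put the same set of values in two columns, with some values missing from each copy, so that I could call geom_point() once per column with different shapes (open and closed). All the data would then appear, but following the rule I specified above. There are other things I want to do in ggplot2 as well, such as grouping by outcome, and I would like those to apply to both columns.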

Melting the data might be one way to do this, but if so, I don't know how I would then implement the condition that would (probably) be needed.
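For reference, here is a minimal base R sketch of what "melting" could look like for this data. It assumes the two-column layout described above; the column names `pval` and `kind` are hypothetical names introduced only for illustration.

```r
# Same layout as described above: each row has a value in exactly one
# of the two p-value columns.
df <- data.frame(
  outcome = rep(c("Outcome 1", "Outcome 2", "Outcome 3"), times = 3),
  sample = rep(c("Indiana", "Colorado", "Virginia"), each = 3),
  pvals_open = c(0.095, 0.120, 0.420, NA, 0.192, 0.121, NA, 0.22, 0.30),
  pvals_closed = c(NA, NA, NA, 0.029, NA, NA, 0.043, NA, NA)
)

# Stack the two p-value columns into one long data frame and record
# which column each value came from (`kind` is a hypothetical name).
long <- rbind(
  data.frame(df[c("outcome", "sample")], pval = df$pvals_open, kind = "open"),
  data.frame(df[c("outcome", "sample")], pval = df$pvals_closed, kind = "closed")
)

# Drop the rows that were NA in the column they came from.
long <- long[!is.na(long$pval), ]
```

With the data in this shape, `kind` could be mapped to `shape` in a single geom_point() layer, for example together with `scale_shape_manual(values = c(open = 1, closed = 19))`.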

See the example below for my attempt.

library(ggplot2)

df <- data.frame(
    outcome = c("Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3"),
    sample = c("Indiana", "Indiana", "Indiana", "Colorado", "Colorado", "Colorado", "Virginia", "Virginia", "Virginia"),
    pvals_open = c(0.095, 0.120, 0.420, NA, 0.192, 0.121, NA, 0.22, 0.30),
    pvals_closed = c(NA, NA, NA, 0.029, NA, NA, 0.043, NA, NA)
)

pd <- position_dodge(0.8)

picture <- ggplot(df, aes(x = outcome, y = pvals_open, group = sample, colour = sample)) +
    geom_point(aes(shape = sample), size = 2, alpha = 1, position = pd) +
    # Use geom_point to make points look open
    geom_point(aes(shape = sample), size = 1, alpha = 1, position = pd, color = "white") +
    # Would like to incorporate points from pvals_closed
    geom_point(data = df, aes(x = outcome, y = pvals_closed, group = sample, colour = sample)) +
    # Doesn't quite work. For Outcome 1, the black circle should be a black square slightly above
    # the orange triangle (but not directly so), and the green circle should be below (but not directly so)
    # Three colors for Indiana, Colorado, and Virginia. Would like this to hold for both sets of p-values
    scale_colour_manual(values = c('#91D699', '#F95A36', '#000000')) +  
    # Other features I'd like to include
    coord_flip(ylim = c(0,1)) + 
    theme(
    legend.justification=c(0, 1),
    legend.position = "none",
    legend.title = element_blank(),
    legend.background = element_rect(fill = NA),
    text = element_text(size=11),
    panel.background = element_rect(fill = NA),
    panel.border = element_rect(fill = NA, color = 'grey75'),
    axis.ticks = element_blank(),
    plot.title = element_text(size=14),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank())

Here is the picture I've made:

If anyone has any solutions or guidance, I would greatly appreciate it. Thanks!

2 answers:

Answer 0: (score: 2)

library(ggplot2)

data.frame(
  outcome = c("Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3"),
  sample = c("Indiana", "Indiana", "Indiana", "Colorado", "Colorado", "Colorado", "Virginia", "Virginia", "Virginia"),
  pvals = c(0.095, 0.120, 0.420, 0.029, 0.192, 0.121, 0.043, 0.22, 0.30),
  stringsAsFactors = FALSE
) -> xdf

# use this for the shape factor control

xdf$shape <- ifelse(xdf$pvals >= 0.05, "open", "closed")
xdf$shape <- sprintf("%s-%s", xdf$sample, xdf$shape)

# we'll use position_nudge() but we need to specify the values manually
# since position can't be mapped to an aesthetic. You could (fairly easily)
# use some logic to programmatically set the values here vs the hard
# coding that I did (have to leave some work for the OP ;-)

ggplot(xdf) +
  geom_point(
    aes(pvals, outcome, group = sample, colour = sample, shape = shape),
    position = position_nudge(y = c(0, 0, 0, 0, -0.1, -0.1, 0.1, 0.1, 0.1)),
    size = 3, stroke=1
  ) +
  scale_x_continuous(limits=c(0,1)) +
  scale_color_manual(
    name = NULL,
    values = c(
      "Indiana" = "#F95A36",
      "Colorado" = "#91D699",
      "Virginia" = "#000000"
    )
  ) +
  scale_shape_manual( # here's how we get the shape aesthetic mapping
    name = NULL,
    values = c(
      "Indiana-open" = 24, 
      "Indiana-closed" = 17, 
      "Colorado-open" = 1,
      "Colorado-closed" = 19, 
      "Virginia-open" = 0, 
      "Virginia-closed" = 15
    ),
    labels = c( # no legend is shown, but in case you decide to add one, here are better labels for ^^ (named, so each label is matched to its break)
      "Colorado-closed" = "Colorado (p<0.05)",
      "Colorado-open" = "Colorado (p>=0.05)",
      "Indiana-closed" = "Indiana (p<0.05)",
      "Indiana-open" = "Indiana (p>=0.05)",
      "Virginia-closed" = "Virginia (p<0.05)",
      "Virginia-open" = "Virginia (p>=0.05)"
    )
  ) +
  theme(
    text = element_text(size = 11),
    axis.ticks = element_blank(),
    panel.background = element_rect(fill = NA),
    panel.border = element_rect(fill = NA, color = 'grey75'),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.justification = c(0, 1),
    legend.position = "none",
    legend.title = element_blank(),
    legend.background = element_rect(fill = NA)
  )

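The code comments above mention that the nudge values could be set programmatically instead of hard-coded. A minimal sketch of one way to do that follows, assuming evenly spaced offsets per sample are acceptable. The helper `nudge_by_group()` and its `width` parameter are hypothetical and not part of ggplot2.

```r
# Hypothetical helper: give every group an evenly spaced offset centred on 0.
# `width` is the total spread between the first and last group.
nudge_by_group <- function(group, width = 0.2) {
  idx <- as.integer(factor(group))  # 1..k, in alphabetical order of the groups
  k <- max(idx)
  if (k == 1) {
    return(rep(0, length(group)))   # a single group needs no offset
  }
  (idx - (k + 1) / 2) * width / (k - 1)
}

samples <- c("Indiana", "Indiana", "Indiana",
             "Colorado", "Colorado", "Colorado",
             "Virginia", "Virginia", "Virginia")

# One offset per row, in the same order as the rows of the data
offsets <- nudge_by_group(samples)
```

The result could then be passed as `position = position_nudge(y = nudge_by_group(xdf$sample))`. Note that this gives every Colorado point the same offset, whereas the hard-coded vector above leaves the first Colorado point at 0.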

Answer 1: (score: 2)

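I may be missing something here, but it seems you are representing sample by both colour and shape, without gaining anything from using both. You could perhaps simplify this by using only colour to represent sample, and then using shape to indicate whether a point is above or below 0.05. Even easier, you could just add a line at 0.05, which makes it easy to tell which points fall above or below it.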

library(dplyr)
library(ggplot2)

df <- tibble(
  outcome = c("Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3", "Outcome 1","Outcome 2", "Outcome 3"),
  sample = c("Indiana", "Indiana", "Indiana", "Colorado", "Colorado", "Colorado", "Virginia", "Virginia", "Virginia"),
  pvals_open = c(0.095, 0.120, 0.420, NA, 0.192, 0.121, NA, 0.22, 0.30),
  pvals_closed = c(NA, NA, NA, 0.029, NA, NA, 0.043, NA, NA)
)

df2 <- df %>% 
  mutate(
    val = coalesce(pvals_open, pvals_closed),
    sig = if_else(val > 0.05, "> 0.05", "<= 0.05")
  ) %>% 
  select(outcome, sample, val, sig)

ggplot(df2) +
  aes(x = outcome, y = val, group = sample, colour = sample, shape = sig) +
  geom_point(size = 2, position = position_dodge(0.8)) +
  geom_hline(yintercept = 0.05, linetype = "dotted") +
  coord_flip(ylim = c(0,1)) +
  theme(
    # legend.justification=c(0, 1),
    # legend.position = "none",
    # legend.title = element_blank(),
    # legend.background = element_rect(fill = NA),
    text = element_text(size=11),
    panel.background = element_rect(fill = NA),
    panel.border = element_rect(fill = NA, color = 'grey75'),
    axis.ticks = element_blank(),
    plot.title = element_text(size=14),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank())

Created on 2018-10-14 by the reprex package (v0.2.0).
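If the open/closed look from the question is still wanted with this approach, the two levels of `sig` can be given explicit shapes. The following is a sketch only, assuming the same `val` and `sig` columns as above (rebuilt here in base R so the block is self-contained) and using ggplot2's built-in shape 1 (hollow circle) and shape 19 (solid circle); the legend title "p-value" is an arbitrary choice.

```r
library(ggplot2)

# Same values as df2 above, rebuilt in base R
df2 <- data.frame(
  outcome = rep(c("Outcome 1", "Outcome 2", "Outcome 3"), times = 3),
  sample = rep(c("Indiana", "Colorado", "Virginia"), each = 3),
  val = c(0.095, 0.120, 0.420, 0.029, 0.192, 0.121, 0.043, 0.22, 0.30)
)
df2$sig <- ifelse(df2$val > 0.05, "> 0.05", "<= 0.05")

p <- ggplot(df2, aes(x = outcome, y = val, group = sample,
                     colour = sample, shape = sig)) +
  geom_point(size = 3, position = position_dodge(0.8)) +
  geom_hline(yintercept = 0.05, linetype = "dotted") +
  # hollow circle above the threshold, solid circle at or below it
  scale_shape_manual(name = "p-value",
                     values = c("> 0.05" = 1, "<= 0.05" = 19)) +
  coord_flip(ylim = c(0, 1))
```

Printing `p` draws the plot. Colour still identifies the sample, so a single pair of shapes is enough to carry the threshold information.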