Problem aligning violin plots with the boxplots inside them

Date: 2020-05-18 01:42:07

Tags: r ggplot2 boxplot violin-plot

I have a melted data frame df whose first column is sample names, the second is Group, the third is Genes, and the fourth is Expression (logCPM).

head(df)

sample names    Group   Genes   Expression (logCPM)
Sample1        GroupA   Gene1   3.45
Sample2        GroupA   Gene1   2.34
Sample3        GroupA   Gene1   0.5667
Sample4        GroupA   Gene1   1.98
Sample5        GroupA   Gene1   0.45
Sample6        GroupB   Gene1   4.566
Sample7        GroupB   Gene1   0.5667

I am trying to make violin plots combined with boxplots using the following code:

positions <- c("GroupA", "GroupB")
e <- ggplot(df, aes(x = Genes, y = `Expression (logCPM)`))  # backticks are required for a column name containing spaces and parentheses
e2 <-  e + geom_violin(
  aes(color = Group), trim = FALSE,
  position = position_dodge(0.9), draw_quantiles=c(0.5)) +
  geom_boxplot(
    aes(color = Group), width = 0.01,
    position = position_dodge(0.9)) +
  scale_color_manual(legend_title, values = c("GroupA"="#FC4E07", "GroupB"="#00AFBB")) +
  theme_bw(base_size = 14) + xlab("") + ylab("Expression (logCPM)") +
  theme(axis.text=element_text(size=15, face = "bold", color = "black"),
        axis.title=element_text(size=15, face = "bold", color = "black"),
        strip.text = element_text(size=15, face = "bold", color = "black"),
        axis.text.x = element_text(angle = 0),
        legend.text=element_text(size=12, face = "bold", color = "black"),
        legend.title=element_text(size=15,face = "bold", color = "black"))
e2


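I want a boxplot inside each violin, but the result does not look right: the shapes look like lines rather than violins. Is there something I need to correct?

The data set I am working with is large.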

1 Answer:

Answer 0: (score: 0)

I had to copy your first three samples into GroupB to make up for the small sample size. Is this what you are looking for?

library(tidyverse)
df <- tribble(~"sample names",~Group,~Genes,~"Expression (logCPM)",
              "Sample1","GroupA","Gene1",3.45,
              "Sample2","GroupA","Gene1",2.34,
              "Sample3","GroupA","Gene1",0.5667,
              "Sample4","GroupA","Gene1",1.98,
              "Sample5","GroupA","Gene1",0.45,
              "Sample6","GroupB","Gene1",4.566,
              "Sample7","GroupB","Gene1",0.5667,
              "Sample8","GroupB","Gene1",3.45, # extra, copied from Sample 1
              "Sample9","GroupB","Gene1",2.34, # extra, copied from Sample 2
              "Sample10","GroupB","Gene1",0.5667) # extra, copied from Sample 3

# I prefer to keep all the aes() mappings in the first ggplot() call so that
# the remaining layers are only about customising the plot
ggplot(df, aes(x = Genes, y = `Expression (logCPM)`, group = Group, fill = Group)) +
  geom_violin(trim = FALSE, alpha = 0.5, draw_quantiles = c(0.5), position = position_dodge(1)) +
  geom_boxplot(width = 0.1, position = position_dodge(1)) +
  theme_bw() # + other theme settings
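
If you want to keep the colours and axis label from your original code, the same idea could be sketched roughly as below. This is only an illustration under a few assumptions: the expression column is renamed to a hypothetical `logCPM` to avoid backticks, the toy data are the padded values from above rather than your real data, and a single `dodge` object (my own name) is shared by both layers so the boxes sit on the same centres as the violins.

```r
library(ggplot2)

# Toy data from the question, with three GroupA values copied into GroupB.
# "logCPM" is a hypothetical, syntactic rename of "Expression (logCPM)".
df <- data.frame(
  Group  = rep(c("GroupA", "GroupB"), each = 5),
  Genes  = "Gene1",
  logCPM = c(3.45, 2.34, 0.5667, 1.98, 0.45,
             4.566, 0.5667, 3.45, 2.34, 0.5667)
)

# One shared dodge object, so violins and boxplots are shifted identically
dodge <- position_dodge(width = 1)

p <- ggplot(df, aes(x = Genes, y = logCPM, fill = Group)) +
  geom_violin(trim = FALSE, alpha = 0.5, position = dodge) +
  geom_boxplot(width = 0.1, position = dodge) +
  # Colours taken from the question, applied to fill instead of colour
  scale_fill_manual(values = c(GroupA = "#FC4E07", GroupB = "#00AFBB")) +
  theme_bw(base_size = 14) +
  labs(x = NULL, y = "Expression (logCPM)")
p
```

The main differences from the code in the question are the wider boxplot (`width = 0.1` instead of `0.01`) and mapping `Group` to `fill`, so that each violin is drawn as a filled shape rather than only an outline. Whether that is enough on the full data set is something you would have to check.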
