R - Reformatting p-values in ggplot using 'stat_compare_means'

Asked: 2019-12-27 00:14:27

Tags: r ggplot2 ggpubr

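I want to draw p-values on each panel of a faceted ggplot. If the p-value is greater than 0.05, I want to display it as is. If the p-value is less than 0.05, I want to display it in scientific notation (i.e. 0.0032 -> 3.20e-3; 0.0000425 -> 4.25e-5).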

The code I wrote for this is:

   p1 <- ggplot(data = CD3, aes(location, value, color = factor(location),
                             fill = factor(location))) + 
  theme_bw(base_rect_size = 1) +
  geom_boxplot(alpha = 0.3, size = 1.5, show.legend = FALSE) +
  geom_jitter(width = 0.2, size = 2, show.legend = FALSE) +
  scale_color_manual(values=c("#4cdee6", "#e47267", "#13ec87")) +
  scale_fill_manual(values=c("#4cdee6", "#e47267", "#13ec87")) +
  ylab(expression(paste("Density of clusters, ", mm^{-2}))) +
  xlab(NULL) +
  stat_compare_means(comparisons = list(c("CT", 'N'), c("IF","N")), 
                     aes(label = ifelse(..p.format.. < 0.05, formatC(..p.format.., format = "e", digits = 2),
                                        ..p.format..)), 
                     method = 'wilcox.test', show.legend = FALSE, size = 10) +
  #ylab(expression(paste('Density, /', mm^2, )))+
  theme(axis.text = element_text(size = 10), 
        axis.title = element_text(size = 20), 
        legend.text = element_text(size = 38), 
        legend.title = element_text(size = 40), 
        strip.background = element_rect(colour="black", fill="white", size = 2),
        strip.text = element_text(margin = margin(10, 10, 10, 10), size = 40),
        panel.grid = element_line(size = 1.5))
plot(p1)

This code runs without errors, but the format of the numbers does not change. What am I doing wrong? I have attached the data needed to reproduce the plot.

Edit

structure(list(value = c(0.931966449207829, 3.24210526315789, 
3.88811650210901, 0.626860993574675, 4.62085308056872, 0.477508650519031, 
0.111900110501359, 3.2495164410058, 4.06626506024096, 0.21684918139434, 
1.10365086026018, 4.66666666666667, 0.174109967855698, 0.597625869832174, 
2.3758865248227, 0.360751947840548, 1.00441501103753, 3.65168539325843
), Criteria = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Density", "Density of cluster", 
"nodular count", "Elongated count"), class = "factor"), Case = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 
6L), .Label = c("Case 1A", "Case 1B", "Case 2", "Case 3", "Case 4", 
"Case 5"), class = "factor"), Mark = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CD3", 
"CD4", "CD8", "CD20", "FoxP3"), class = "factor"), location = structure(c(3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L), .Label = c("CT", "IF", "N"), class = "factor")), row.names = c(91L, 
92L, 93L, 106L, 107L, 108L, 121L, 122L, 123L, 136L, 137L, 138L, 
151L, 152L, 153L, 166L, 167L, 168L), class = "data.frame")

1 Answer:

Answer 0 (score: 1)

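I think your problem comes from using the comparisons argument in stat_compare_means. I am not entirely sure, but my guess is that the p-value output of stat_compare_means differs from that of compare_means, so you cannot use it in the aes for label.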
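One way to check what the standalone function returns is to call compare_means directly and look at the column types. This is only a sketch: the data frame is rebuilt from the question's dput() output with values rounded to three decimals, and the name res is chosen here for illustration.

```r
library(ggpubr)

# Data from the question's dput(), rounded to three decimals for brevity.
df <- data.frame(
  value = c(0.932, 3.242, 3.888, 0.627, 4.621, 0.478, 0.112, 3.250, 4.066,
            0.217, 1.104, 4.667, 0.174, 0.598, 2.376, 0.361, 1.004, 3.652),
  location = factor(rep(c("N", "CT", "IF"), times = 6),
                    levels = c("CT", "IF", "N"))
)

# Pairwise Wilcoxon tests of each group against the reference group "N".
res <- compare_means(value ~ location, data = df, ref.group = "N",
                     method = "wilcox.test")

# Inspect the types of the raw and the pre-formatted p-value columns.
str(res[, c("p", "p.format")])
```

In this table p is numeric while p.format is already a character string, so numeric tests such as p < 0.05 are safer on the p column.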

Let me explain using your example. You can change how the p-value is displayed like this:

library(ggplot2)
library(ggpubr)
ggplot(df, aes(x = location, y = value, color = location))+
  geom_boxplot()+
  stat_compare_means(ref.group = "N", aes(label = ifelse(p < 0.05,sprintf("p = %2.1e", as.numeric(..p.format..)), ..p.format..)))

The p-value is displayed correctly, but you lose the bars. If you use the comparisons argument instead, you get:

library(ggplot2)
library(ggpubr)
ggplot(df, aes(x = location, y = value, color = location))+
  geom_boxplot()+
  stat_compare_means(comparisons = list(c("CT","N"), c("IF","N")), aes(label = ifelse(p < 0.05,sprintf("p = %2.1e", as.numeric(..p.format..)), ..p.format..)))

So now you get the bars, but the p-values are not displayed in the right format.

To work around this, you can run the statistics outside of ggplot2 with the compare_means function and use the ggsignif package to draw the annotations in the right format.

Here I use dplyr and its mutate function to create the new columns, but you could easily do the same in base R.
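The code for this step is not preserved here, so the following is only a sketch of what it might look like. The data frame is rebuilt from the question's dput() output (values rounded to three decimals); the column names y_pos and labels, the bracket heights 5 and 5.5, and the exact label format are assumptions made for illustration.

```r
library(ggpubr)
library(dplyr)

# Data from the question's dput(), rounded to three decimals for brevity.
df <- data.frame(
  value = c(0.932, 3.242, 3.888, 0.627, 4.621, 0.478, 0.112, 3.250, 4.066,
            0.217, 1.104, 4.667, 0.174, 0.598, 2.376, 0.361, 1.004, 3.652),
  location = factor(rep(c("N", "CT", "IF"), times = 6),
                    levels = c("CT", "IF", "N"))
)

# Run the tests outside ggplot2, then add plotting columns.
stats <- compare_means(value ~ location, data = df, ref.group = "N") %>%
  mutate(
    # Hypothetical bracket heights, chosen to sit above the largest value (~4.67).
    y_pos = c(5, 5.5),
    # Scientific notation below 0.05, plain rounded value otherwise.
    labels = ifelse(p < 0.05,
                    sprintf("%.2e", p),
                    as.character(signif(p, 2)))
  )

# Base R equivalent of the mutate() call:
# stats$y_pos  <- c(5, 5.5)
# stats$labels <- ifelse(stats$p < 0.05, sprintf("%.2e", stats$p),
#                        as.character(signif(stats$p, 2)))
```

Note that sprintf("%.2e", ...) in R writes at least two exponent digits (for example e-03 rather than e-3).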

Then you can plot it:
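The plotting code is not preserved here either. A minimal sketch, assuming the table of tests is built as described above and passed to geom_signif in manual mode, could look like this. The names stats, y_pos and labels, the bracket heights, and the y-axis limits are assumptions for illustration, and the data values are rounded from the question's dput() output.

```r
library(ggplot2)
library(ggpubr)
library(ggsignif)

# Data from the question's dput(), rounded to three decimals for brevity.
df <- data.frame(
  value = c(0.932, 3.242, 3.888, 0.627, 4.621, 0.478, 0.112, 3.250, 4.066,
            0.217, 1.104, 4.667, 0.174, 0.598, 2.376, 0.361, 1.004, 3.652),
  location = factor(rep(c("N", "CT", "IF"), times = 6),
                    levels = c("CT", "IF", "N"))
)

# Tests computed outside ggplot2, with label and bracket-height columns.
stats <- as.data.frame(compare_means(value ~ location, data = df,
                                     ref.group = "N"))
stats$y_pos  <- c(5, 5.5)  # hypothetical bracket heights
stats$labels <- ifelse(stats$p < 0.05,
                       sprintf("%.2e", stats$p),
                       as.character(signif(stats$p, 2)))

p <- ggplot(df, aes(x = location, y = value)) +
  geom_boxplot(aes(colour = location)) +
  ylim(0, 6) +  # leave room for the brackets above the boxes
  # manual = TRUE makes geom_signif read bracket ends, labels and heights
  # from the supplied data frame instead of running its own tests.
  geom_signif(data = stats,
              aes(xmin = group1, xmax = group2,
                  annotations = labels, y_position = y_pos),
              manual = TRUE)
print(p)
```

ggsignif may warn about unknown aesthetics in manual mode; as far as I know this is expected and the brackets are still drawn.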


Does this look like what you are trying to plot?