A more "automated" way to indicate statistically significant differences in bar graphs using R?

Time: 2014-05-20 09:36:40

Tags: r

I hope someone can help me, as I am fairly new to both R and Stack Overflow.

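I am trying to use R to create a set of bar graphs that show the p-values for the differences between treated and untreated samples. I have found two other posts similar to mine ("Indicating the statistically significant difference in bar graph USING R" and "Indicating the statistically significant difference in bar graph").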

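However, I was wondering whether there is a more "automated" way to place the labels and lines that indicate statistical significance in the plot than the one described in the first of those posts ("Indicating the statistically significant difference in bar graph USING R"). Doing this by hand does produce nice-looking graphs, but it is very time-consuming.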

Many thanks!

Example data (sorry, I don't know how to upload it so that it can be imported from a .csv):

Time,Dose,Variable,n,Mean,SD,Median,Upper.SEM,Lower.SEM
1,0,P,3,20.1341,1.049791,20,0.5728394,0.5569923
1,1,P,3,22.79528,1.110182,21.64,1.4179833,1.334943
6,0,P,3,38.63702,1.042969,37.74,0.9499892,0.9271918
6,1,P,3,24.25966,1.156925,23.82,2.1300073,1.9580866
24,0,P,3,42.3231,1.073583,43.75,1.7710033,1.6998725
24,1,P,3,13.78995,1.170568,13.15,1.3126463,1.1985573
48,0,P,3,36.01035,1.208213,35.63,4.1551262,3.7252776
48,1,P,3,23.3236,1.4403,20.65,5.4688355,4.4300848



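library(ggplot2)
# ExData is the example data above; ggplot() replaces qplot(), whose
# stat and position arguments are deprecated in current ggplot2
g <- ggplot(ExData, aes(x=factor(Time), y=Mean, fill=factor(Dose))) +
  geom_bar(stat="identity", position="dodge") +
  geom_errorbar(aes(ymax=Mean+Upper.SEM, ymin=Mean-Lower.SEM),
                position=position_dodge(0.9), width=0.5)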
g<-g+  xlab("Time (hrs)") 
g<-g+  ylab("Concentration (pmol/uL)") 
g<-g+ coord_cartesian(ylim=c(0, 50)) + scale_y_continuous(breaks=seq(0, 50, 5))
g<-g+ guides(fill=guide_legend(title="Dose (uM)"))
g<-g+ scale_fill_manual(values=c("red","blue"))
g<-g+ theme_bw()
g<-g+ theme(plot.title = element_text(face="bold", size=20))
g<-g+ theme(axis.title.x = element_text(face="bold", size=20))
g<-g+ theme(axis.title.y = element_text(face="bold", size=20))
g<-g+ theme(axis.text.x=element_text(face="bold",colour='black', size=20))
g<-g+ theme(axis.text.y=element_text(face="bold",colour='black', size=20))
g<-g+theme(axis.text=element_text(face="bold", size=20))
# Legend Title and label appearance
g<- g+theme(legend.title = element_text(colour="black", size=20, face="bold"))
g<- g + theme(legend.text = element_text(colour="black", size = 20, face = "bold"))
### Line for p-value 1uM vs 0uM at 1hr
g<-g+ annotate("text",x=1,y=27,label="p=0.1289")
g<- g+ annotate("segment", x = 0.8, xend = 0.8, y = 25, yend = 26,colour = "black")
g<- g+ annotate("segment", x = 1.2, xend = 1.2, y = 25, yend = 26,colour = "black")
g<- g+ annotate("segment", x = 0.8, xend = 1.2, y = 26, yend = 26, colour = "black")
### Line for p-value 1uM vs 0uM at 6hr
g<-g+ annotate("text",x=2,y=42,label="p=0.0063")
g<- g+ annotate("segment", x = 1.8, xend = 1.8, y = 40, yend = 41, colour = "black")
g<- g+ annotate("segment", x = 2.2, xend = 2.2, y = 40, yend = 41, colour = "black")
g<- g+ annotate("segment", x = 1.8, xend = 2.2, y = 41, yend = 41,colour = "black")
### Line for p-value 1uM vs 0uM at 24hr
g<-g+ annotate("text",x=3,y=47,label="p=0.0004")
g<- g+ annotate("segment", x = 2.8, xend = 2.8, y = 45, yend = 46,colour = "black")
g<- g+ annotate("segment", x = 3.2, xend = 3.2, y = 45, yend = 46, colour = "black")
g<- g+ annotate("segment", x = 2.8, xend = 3.2, y = 46, yend = 46,colour = "black")
### Line for p-value 1uM vs 0uM at 48hr
g<-g+ annotate("text",x=4,y=43,label="p=0.1670")
g<- g+ annotate("segment", x = 3.8, xend = 3.8, y = 41, yend = 42,colour = "black")
g<- g+ annotate("segment", x = 4.2, xend = 4.2, y = 41, yend = 42,colour = "black")
g<- g+ annotate("segment", x = 3.8, xend = 4.2, y = 42, yend = 42,colour = "black")
g

(Sorry, the site won't let me upload a picture of the graph.)

1 answer:

Answer 0 (score: 0)

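To me, the crux of the problem is finding a workable height for the line drawn over (i.e. above) each pair of bars in the graph. All the other parts of the OP's code are straightforward. Below is a semi-working solution (it is nearly bedtime for me), but the basics are there, along with some remarks and suggestions for improvement.

Read the data from the clipboard, then compute the height of the line from the available data for each combination of dose and time.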

# The example data in the question is comma-separated, hence sep=","
dat <- read.table("clipboard", sep=",", header=TRUE)
library(plyr)
dat <- ddply(dat, .(Time), 
             function(d.f) {
               A <- subset(d.f, Dose==0)
               B <- subset(d.f, Dose==1)
               m <- max(A$Mean+A$Upper.SEM, B$Mean+B$Upper.SEM)
               d.f$bar.h <- round(m) + 5
               return(d.f)
             })
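
For readers who would rather not depend on plyr, the same per-Time height can be sketched in base R with ave(). This is only an illustration: the data frame is re-typed from the example data in the question (just the columns needed), and the 5-unit gap is the one used in the ddply() call above.

```r
# Example data re-typed from the question (only the columns needed here).
dat <- data.frame(
  Time      = c(1, 1, 6, 6, 24, 24, 48, 48),
  Dose      = c(0, 1, 0, 1, 0, 1, 0, 1),
  Mean      = c(20.1341, 22.79528, 38.63702, 24.25966,
                42.3231, 13.78995, 36.01035, 23.3236),
  Upper.SEM = c(0.5728394, 1.4179833, 0.9499892, 2.1300073,
                1.7710033, 1.3126463, 4.1551262, 5.4688355)
)

# Top of the tallest error bar within each Time group, rounded,
# plus a 5-unit gap. ave() returns one value per row of dat.
dat$bar.h <- round(ave(dat$Mean + dat$Upper.SEM, dat$Time, FUN = max)) + 5
```

Every row of a given Time gets the same bar.h, which is what the ddply() version produces as well.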

Set up the basic plot (never mind the polishing):

library(ggplot2)
p <- ggplot(data=dat, mapping=aes(x=factor(Time), y=Mean, fill=factor(Dose))) +
  geom_bar(stat='identity', position='dodge') + 
  geom_errorbar(aes(ymin=Mean-Lower.SEM, ymax=Mean+Upper.SEM), 
                position=position_dodge(0.9), width=0.5)

## Instead of using annotate (which is just as fine), use geom_segment because
## the data is directly in the existing data.frame
p + geom_segment(mapping=aes(x=0.8, xend=1.2, y=bar.h, yend=bar.h))

Don't forget the aes() in the call to geom_segment().

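Now when you run the code you will see (obvious in hindsight) three lines above the first pair of bars, because I fixed the x coordinates of the segment to the same values for all levels of Time.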

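To improve on this, the next step is to add another set of columns to the data frame that specify the x coordinates of each bar/segment, and to update the geom_segment() code accordingly. For simplicity, assuming you want to create several graphs that each have the same number of bars, the easiest way is to do this by hand.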
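
As a rough sketch of that next step, the x coordinates can also be derived from the factor levels instead of being typed in. The helper data frame `brackets` and its column names are made up for this illustration, the data is re-typed from the question, and the p-value labels are the ones the OP put in the annotate() calls. It assumes exactly two bars per Time level, so that with position_dodge(0.9) the bar centres sit 0.9/4 to either side of each level.

```r
library(ggplot2)

# Example data re-typed from the question (only the columns needed here).
dat <- data.frame(
  Time      = c(1, 1, 6, 6, 24, 24, 48, 48),
  Dose      = c(0, 1, 0, 1, 0, 1, 0, 1),
  Mean      = c(20.1341, 22.79528, 38.63702, 24.25966,
                42.3231, 13.78995, 36.01035, 23.3236),
  Upper.SEM = c(0.5728394, 1.4179833, 0.9499892, 2.1300073,
                1.7710033, 1.3126463, 4.1551262, 5.4688355),
  Lower.SEM = c(0.5569923, 1.334943, 0.9271918, 1.9580866,
                1.6998725, 1.1985573, 3.7252776, 4.4300848)
)

dodge <- 0.9  # same width as in position_dodge() below

# Hypothetical helper: one row per Time level, i.e. one bracket per bar pair.
top <- tapply(dat$Mean + dat$Upper.SEM, dat$Time, max)
brackets <- data.frame(
  x.mid = seq_along(top),               # factor levels sit at 1, 2, 3, ...
  bar.h = round(as.numeric(top)) + 5,   # same height rule as above
  label = paste0("p=", c("0.1289", "0.0063", "0.0004", "0.1670"))
)
brackets$x.left  <- brackets$x.mid - dodge / 4  # centre of the left bar
brackets$x.right <- brackets$x.mid + dodge / 4  # centre of the right bar

p <- ggplot(dat, aes(x = factor(Time), y = Mean, fill = factor(Dose))) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Mean - Lower.SEM, ymax = Mean + Upper.SEM),
                position = position_dodge(dodge), width = 0.5) +
  # inherit.aes = FALSE because brackets has no Time, Mean or Dose columns.
  geom_segment(data = brackets, inherit.aes = FALSE,
               aes(x = x.left, xend = x.right, y = bar.h, yend = bar.h)) +
  geom_segment(data = brackets, inherit.aes = FALSE,
               aes(x = x.left, xend = x.left, y = bar.h - 1, yend = bar.h)) +
  geom_segment(data = brackets, inherit.aes = FALSE,
               aes(x = x.right, xend = x.right, y = bar.h - 1, yend = bar.h)) +
  geom_text(data = brackets, inherit.aes = FALSE,
            aes(x = x.mid, y = bar.h + 1, label = label))
p
```

Keeping the brackets in their own data frame means each layer draws one segment per Time level rather than one per row of dat, and the same code should carry over to other data sets with two bars per group.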

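Extrapolating from this assumption: when you repeatedly make pairwise comparisons in an uninformed manner, beware of capitalization of chance (but this is only a guess on my part; you did not say where the p-values come from).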
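
If the four p-values do come from separate, unplanned pairwise tests, one common way to guard against capitalization of chance is to adjust them before labelling the plot. The sketch below uses p.adjust() from base R's stats package on the values printed in the question. The choice of Holm's method is an assumption made for illustration, and whether any correction is appropriate depends on how the p-values were obtained, which the question does not say.

```r
# Raw p-values as printed on the plot in the question (1, 6, 24 and 48 hr).
p.raw <- c("1" = 0.1289, "6" = 0.0063, "24" = 0.0004, "48" = 0.1670)

# Holm's step-down adjustment; "bonferroni" and "BH" are other built-in methods.
p.holm <- p.adjust(p.raw, method = "holm")

# Text labels ready to pass to annotate() or geom_text().
p.labels <- paste0("p=", formatC(p.holm, format = "f", digits = 4))
```

The adjusted values are never smaller than the raw ones, so comparisons that were borderline before adjustment may no longer count as significant.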