How to create a box plot showing mean and standard deviation, with the points overlaid as a scatter

Date: 2017-08-23 21:24:37

Tags: r matlab stata scatter-plot boxplot

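I am trying to create a box plot in R that shows the mean and standard deviation, with the individual points overlaid as a scatter, by modifying the code below.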

library(ggplot2)

# create fictitious data
a <- runif(10)
b <- runif(12)
c <- runif(7)
d <- runif(15)

# data groups
group <- factor(rep(1:4, c(10, 12, 7, 15)))

# dataframe
mydata <- data.frame(c(a,b,c,d), group)
names(mydata) <- c("value", "group")

# function for computing min, mean - SD, mean, mean + SD and max
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# ggplot code
p1 <- ggplot(aes(y = value, x = factor(group)), data = mydata)
p1 <- p1 +
  stat_summary(fun.data = min.mean.sd.max, geom = "boxplot") +
  geom_jitter(position = position_jitter(width = .2), size = 3) +
  ggtitle("Boxplot con media, 95%CI, valore min. e max.") +
  xlab("Gruppi") + ylab("Valori")

Here is my dataset:

number  name    percent
1   CD1_lung1   0.824214533
3   CD1_lung2   1.118706494
5   CD1_lung3   1.271139637
7   CD1_lung4   0.785939335
9   CNR_20  0.592576592
11  CNR_lung    1.764129689
13  CNR_2   0.643293719
2   Gpc_KO1_lung    0.730014957
4   Gpc_KO2_lung    0.679556429
6   Gpc_KO3_lung    1.00910329
8   KO12    1.074708817
10  Gpc1_hom_lung   1.86280637
12  KO35    0.521546931
14  KO45    0.486304707

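I loaded it into R with read.table("C:/Users/me/Desktop/WB0823_m1/wb0823R.txt", header = TRUE); however, since I am very new to R, I am stuck on what to do next. If there is an easier way to create this box plot in MATLAB or Stata, I would love to know about that too! I can't figure it out in either of those two programs.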

1 answer:

Answer 0 (score: 2)

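You just need to read the data into R and decide which individuals to group together; the rest of the code should then work. For editing the plot, ggplot2 has a useful reference site: http://ggplot2.tidyverse.org/reference/. This should get you started: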

library(ggplot2)

# read data into R (I just pasted your data into a blank text file)
mydata <- read.table("~/Desktop/tmp.txt", header = TRUE)

#add a column to group observations (I guessed here)
mydata$group <- c(1,1,1,1,2,2,2,3,3,3,4,3,4,4)

# function for computing min, mean - SD, mean, mean + SD and max
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# ggplot code 
ggplot(aes(y = percent, x = factor(group)), data = mydata)+
  ggtitle("Boxplot con media, 95%CI, valore min. e max.")+xlab("Gruppi")+ylab("Valori")+
  stat_summary(fun.data = min.mean.sd.max, geom = "boxplot")+
  geom_jitter(position=position_jitter(width=.2), size=3)
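
As a quick check on what the box actually encodes, the same summary can be computed per group in base R. This is only a sketch: it pastes the question's data inline with read.table(text = ...) and reuses the guessed group assignment from the answer, so the group labels are an assumption rather than something the data states. The box spans mean ± 1 SD with whiskers at the minimum and maximum, so the "95%CI" in the plot title carried over from the original snippet does not describe what is drawn.

```r
# Question's data pasted inline (whitespace-separated, header row first)
mydata <- read.table(header = TRUE, text = "
number name percent
1  CD1_lung1     0.824214533
3  CD1_lung2     1.118706494
5  CD1_lung3     1.271139637
7  CD1_lung4     0.785939335
9  CNR_20        0.592576592
11 CNR_lung      1.764129689
13 CNR_2         0.643293719
2  Gpc_KO1_lung  0.730014957
4  Gpc_KO2_lung  0.679556429
6  Gpc_KO3_lung  1.00910329
8  KO12          1.074708817
10 Gpc1_hom_lung 1.86280637
12 KO35          0.521546931
14 KO45          0.486304707
")

# Group labels: the answer's guess, NOT derived from the data
mydata$group <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 3, 4, 4)

# Same summary function the plot uses
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# One row per group, one column per box element
stats <- t(sapply(split(mydata$percent, mydata$group), min.mean.sd.max))
print(round(stats, 3))
```

Each row of stats holds the five values that stat_summary passes to geom = "boxplot" for that group, so comparing the table against the figure is an easy way to confirm the grouping came out as intended.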