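How to wrap a mean and CI plot in a function

Time: 2018-07-30 10:43:48

Tags: r function ggplot2 functional-programming

I have the following dataset, produced by summarySE, which shows the mean and confidence interval of bmd for each combination of t and sex.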

mn.bmd <- structure(list(sex = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 
                                     2L, 2L, 2L), .Label = c("female", "male"), class = "factor"), 
                   t = c(10L, 12L, 14L, 16L, 18L, 10L, 12L, 14L, 16L, 18L), 
                   N = c(2731L, 2750L, 2607L, 2524L, 2397L, 2427L, 2452L, 2374L, 
                         2343L, 1935L), bmd = c(0.771745743658987, 0.852563274643638, 
                                                0.959264663475704, 1.00448137517321, 1.03961818701633, 0.78197475849084, 
                                                0.84601311310275, 0.953283665154095, 1.0561553454168, 1.14395286996851
                         ), sd = c(0.0546859583968217, 0.0728002055433497, 0.0765731777406101, 
                                   0.0729628520321917, 0.0752411677480204, 0.0524685598606996, 
                                   0.060935438701901, 0.085630182993752, 0.0964219075622181, 
                                   0.100009937518834), se = c(0.00104644155540708, 0.00138824544949947, 
                                                              0.00149970608925882, 0.00145230263867668, 0.00153681471482534, 
                                                              0.00106503592133958, 0.00123057959424098, 0.00175746431217515, 
                                                              0.00199200110406779, 0.00227354037595468), ci = c(0.00205189747680689, 
                                                                                                                0.00272210959875271, 0.00294073574524029, 0.00284782704999121, 
                                                                                                                0.00301362384271727, 0.00208847400752308, 0.00241308331525491, 
                                                                                                                0.0034463245617893, 0.003906269195061, 0.00445884772686761
                                                              )), class = "data.frame", row.names = c("1", "2", "3", "4", 
                                                                                                      "5", "6", "7", "8", "9", "10"), .Names = c("sex", "t", "N", "bmd", 
                                                                                                                                                 "sd", "se", "ci"))

I can plot the means and confidence intervals against t, one line per sex, with the following code:

library(ggplot2)

ggplot(mn.bmd, aes(x=t, y=bmd, colour=sex)) + 
  geom_errorbar(aes(ymin=bmd-ci, ymax=bmd+ci), size=0.3, width=.3) + 
  geom_line() + geom_point(size=3, shape=21)

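I would like to wrap this ggplot code in a function so that I can repeat it for different data frames. They all have the same structure, but the y column has a different name in each. I tried using aes_string, but had no luck.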

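2 answers:

Answer 0 (score: 1)

I think that, as long as all the data frames have the same column names, it is enough to wrap the ggplot call in a function.

If the column names differ between data frames, you have to pass the name to the function as a string and look it up with get() inside the ggplot() call. So, for example, instead of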

ggplot(x,aes(x=t))

you would have

ggplot(x,aes(x=get(colname_x)))

where colname_x is a string containing the name of the column that ggplot() should use as x.

Edit

In response to the OP's comment: I would add the column name to the function's arguments and put get() calls inside the call to ggplot(), like so

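# colname_y: name of the y column, passed as a string
my_plot <- function(df, colname_y) {
  ggplot(df, aes(x=t, y=get(colname_y), colour=sex)) +
    # build the error bars from the same column instead of hard-coding bmd
    geom_errorbar(aes(ymin=get(colname_y)-ci, ymax=get(colname_y)+ci), size=0.3, width=.3) +
    geom_line() + geom_point(size=3, shape=21)
}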
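
As a possible alternative to get(), here is a minimal sketch that uses the .data pronoun, which ggplot2 supports inside aes() from version 3.0.0 onwards. The function name plot_mean_ci, the demo data frame and its fat column are made up for illustration; the sketch assumes every data frame has columns named t, sex and ci, as mn.bmd does.

library(ggplot2)

# Hypothetical helper: df must contain columns t, sex and ci;
# y_col is the name (a string) of the column holding the means.
plot_mean_ci <- function(df, y_col) {
  ggplot(df, aes(x = t, y = .data[[y_col]], colour = sex)) +
    geom_errorbar(aes(ymin = .data[[y_col]] - ci,
                      ymax = .data[[y_col]] + ci), width = 0.3) +
    geom_line() +
    geom_point(size = 3, shape = 21) +
    ylab(y_col)  # label the axis with the column name
}

# Made-up data with the same layout as mn.bmd but a different y column
demo <- data.frame(
  sex = factor(rep(c("female", "male"), each = 2)),
  t   = c(10L, 12L, 10L, 12L),
  fat = c(0.77, 0.85, 0.78, 0.84),
  ci  = c(0.002, 0.003, 0.002, 0.002)
)

p <- plot_mean_ci(demo, "fat")
print(p)

Because the y column is looked up by name in both aes() calls, the error bars follow whichever column is requested.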

Answer 1 (score: 1)

If all the data frames are built the same way, this should work:

library(ggplot2)

my_plot <- function(df, y) {
  ymin <- df[[y]] - df$ci
  ymax <- df[[y]] + df$ci
  ggplot(df, aes_string(x="t", y=y, colour="sex")) + 
    geom_errorbar(aes(ymin=ymin, ymax=ymax), size=0.3, width=.3) + 
    geom_line() + 
    geom_point(size=3, shape=21)
}

# you can replace mn.bmd with other data frames and check the result
my_plot(df = mn.bmd, y = "bmd")
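
aes_string() has been soft-deprecated in later ggplot2 releases, so here is a rough sketch of the same idea written with the embrace operator {{ }} instead, in which the columns are passed unquoted. It assumes ggplot2 3.0.0 or later together with rlang 0.4.0 or later. The names plot_mean_ci_nse, demo_a, demo_b, bmd_hip and bmd_spine are invented for the example and do not come from the question.

library(ggplot2)

# Hypothetical helper: y and err are unquoted column names of df;
# df must also contain columns t and sex.
plot_mean_ci_nse <- function(df, y, err) {
  ggplot(df, aes(x = t, y = {{ y }}, colour = sex)) +
    geom_errorbar(aes(ymin = {{ y }} - {{ err }},
                      ymax = {{ y }} + {{ err }}), width = 0.3) +
    geom_line() +
    geom_point(size = 3, shape = 21)
}

# Two made-up data frames that differ only in the name of the y column
demo_a <- data.frame(
  sex     = factor(rep(c("female", "male"), each = 2)),
  t       = c(10L, 12L, 10L, 12L),
  bmd_hip = c(0.77, 0.85, 0.78, 0.84),
  ci      = c(0.002, 0.003, 0.002, 0.002)
)
demo_b <- demo_a
names(demo_b)[names(demo_b) == "bmd_hip"] <- "bmd_spine"

p_a <- plot_mean_ci_nse(demo_a, bmd_hip, ci)
p_b <- plot_mean_ci_nse(demo_b, bmd_spine, ci)
print(p_a)
print(p_b)

Whether strings or unquoted names are more convenient depends on how the function is called: strings are easier when looping over a vector of column names, unquoted names read more naturally in interactive use.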