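I am trying to plot several ordered (i.e., from highest to lowest median) conditional boxplots from a single data frame. The general sequence is shown in the code below.
I would like to loop over about 70 variables using this process, but I am getting stuck moving from tapply to aggregate, accessing each variable in the data frame, and coding the loop itself. Apologies in advance for the lack of elegance in the R code below: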
bpdf = data.frame(group=c("A","A","A","B","B","B","C","C","C"),
x=c(1,1,2,2,3,3,3,4,4),
y=c(7,5,2,9,7,6,3,1,2),
z=c(4,5,2,9,8,9,7,6,7))
sorted.medians = rev(sort(with(bpdf,tapply(bpdf$x,bpdf$group,median))))
boxplot(bpdf$x~factor(bpdf$group,levels=names(sorted.medians)))
Answer 0 (score: 2)
I think you just need to put your two lines inside an lapply:
lapply(bpdf[,-1],function(x){
## decreasing better than rev here
y <- sort(tapply(x,bpdf$group,median),decreasing=TRUE)
boxplot(x~factor(bpdf$group,levels=names(y)))
})
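As an alternative sketch (not part of the answer above), base R's reorder() can build the median-ordered factor directly. reorder() sorts levels by increasing summary value, so negating the values should put the highest median first. The helper name ordered_group is hypothetical and only used for illustration.

```r
# Same example data as in the question
bpdf <- data.frame(group = c("A","A","A","B","B","B","C","C","C"),
                   x = c(1,1,2,2,3,3,3,4,4),
                   y = c(7,5,2,9,7,6,3,1,2),
                   z = c(4,5,2,9,8,9,7,6,7))

# Hypothetical helper: reorder() sorts levels by increasing FUN(value),
# so passing -values orders the levels from highest to lowest median
ordered_group <- function(values, group) {
  reorder(factor(group), -values, FUN = median)
}

# One boxplot per measurement column, titled with the column name
for (v in names(bpdf)[-1]) {
  boxplot(bpdf[[v]] ~ ordered_group(bpdf[[v]], bpdf$group),
          main = v, xlab = "group", ylab = v)
}
```

Groups with tied medians keep their original level order here, which may differ from what sort(..., decreasing=TRUE) gives.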
Edit: to show the variable names on the plots, use the main argument of boxplot and loop over the column names of bpdf:
lapply(colnames(bpdf[,-1]),function(i){
## decreasing better than rev here
x <- bpdf[,i]
title <- paste0('title',i) ## you can change it here
y <- sort(tapply(x,bpdf$group,median),decreasing=TRUE)
boxplot(x~factor(bpdf$group,levels=names(y)),main=title)
})
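With around 70 variables, it may be more convenient to send every plot to a single multi-page PDF than to page through them on screen. The following is a minimal sketch under that assumption. The output path out_file is hypothetical, and a plain for loop replaces lapply so that no list of return values is printed.

```r
# Same example data as in the question
bpdf <- data.frame(group = c("A","A","A","B","B","B","C","C","C"),
                   x = c(1,1,2,2,3,3,3,4,4),
                   y = c(7,5,2,9,7,6,3,1,2),
                   z = c(4,5,2,9,8,9,7,6,7))

# Hypothetical output location; change it to wherever the plots should go
out_file <- file.path(tempdir(), "ordered_boxplots.pdf")

pdf(out_file)  # each boxplot() call below starts a new page
for (i in colnames(bpdf)[-1]) {
  x <- bpdf[[i]]
  # Group medians, highest first
  med <- sort(tapply(x, bpdf$group, median), decreasing = TRUE)
  boxplot(x ~ factor(bpdf$group, levels = names(med)),
          main = i, xlab = "group", ylab = i)
}
dev.off()  # close the device so the file is written
```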
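Answer 1 (score: 1)
If I have understood the question correctly, I think the following should do what you want.
Load a couple of packages and create some data: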
library(plyr)
library(reshape2)
dd = data.frame(group=c("A","B","C", "D"),
x1=runif(40),x2=runif(40),x3=runif(40),x4=runif(40))
Now compute the medians conditional on variable and group:
dd_m = melt(dd, "group")
meds = ddply(dd_m, c("variable", "group"), summarise, m = median(value))
Sort the data frame by variable and by decreasing median:
sorted_meds = meds[with(meds, order(variable, -m)), ]
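Since the question mentions aggregate, here is a hedged base R sketch of the same two steps (medians per variable and group, then sorting) without plyr or reshape2. It assumes stack() is an acceptable stand-in for melt; the column names values and ind come from stack(), and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
dd <- data.frame(group = rep(c("A", "B", "C", "D"), times = 10),
                 x1 = runif(40), x2 = runif(40),
                 x3 = runif(40), x4 = runif(40))

# Long format in base R: stack() returns the columns `values` and `ind`
vars <- setdiff(names(dd), "group")
dd_long <- data.frame(group = rep(dd$group, times = length(vars)),
                      stack(dd[, vars]))

# Median per variable and group
meds <- aggregate(values ~ ind + group, data = dd_long, FUN = median)

# Sort by variable, then by decreasing median within each variable
sorted_meds <- meds[order(meds$ind, -meds$values), ]
```

The group order for one variable can then be read off with, for example, sorted_meds$group[sorted_meds$ind == "x1"].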
Loop over the variables, ordering the groups within each subset in turn:
for(var in unique(sorted_meds$variable)){
grp_order = sorted_meds[sorted_meds$variable==var, ]$group
dd_tmp = dd_m[dd_m$variable==var,]
dd_tmp$group = factor(dd_tmp$group, levels = grp_order)
boxplot(dd_tmp$value ~ dd_tmp$group)
}