Pairs plot with factor levels as the diagonal

Time: 2014-04-08 14:34:14

Tags: r ggplot2 ggally

I have this plot, which is too crowded to be useful:

ggplot(data = meandist.SG, aes(x = starttime,y = meandist)) +  #set main plot variables
    geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error 
    geom_line(aes(colour = mapped),alpha = 1) +  #add a line for each group 
    labs(title = "Comparison of Groups", x = "time (s)", y = "mean distance (mm)") #set title, and axis labels

[image: unreadable]

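I can create one plot per pair of groups by wrapping the following in mlply and passing in the possible group pairs. But that means I can't easily see all of the plots at once.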

ggplot(data = subset(meandist.SG, mapped %in% c('a', 'f')) ,aes(x = starttime,y = meandist)) +  #set main plot variables
    geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error to main plot
    geom_line(aes(colour = mapped),alpha = 1,size = 1) + #plot a line on main plot for each group
    labs(title = 'GroupA and GroupB, Distance over Time', x = "time (s)", y = "mean distance (mm)")

[image: a single pair]
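For reference, the per-pair approach described above might look roughly like the sketch below. It uses base R's combn and lapply in place of plyr's mlply. The toy data frame and the plot_pair helper are hypothetical stand-ins for meandist.SG and the plotting code above, not the original code.

```r
library(ggplot2)

# Toy stand-in for meandist.SG (made-up values, for illustration only)
set.seed(1)
toy <- expand.grid(starttime = seq(0, 300, 60),
                   mapped = c("rowA", "rowB", "rowC"),
                   stringsAsFactors = TRUE)
toy$meandist <- 100 + rnorm(nrow(toy), 0, 10)
toy$se <- 5

# Hypothetical helper: draw the ribbon + line plot for one pair of groups
plot_pair <- function(g1, g2, dat = toy) {
  ggplot(subset(dat, mapped %in% c(g1, g2)),
         aes(x = starttime, y = meandist)) +
    geom_ribbon(aes(ymin = meandist - se, ymax = meandist + se, fill = mapped),
                alpha = 0.1) +
    geom_line(aes(colour = mapped)) +
    labs(title = paste(g1, "and", g2), x = "time (s)", y = "mean distance (mm)")
}

# All unordered pairs of factor levels, one pair per column
pair_mat <- combn(levels(toy$mapped), 2)

# One ggplot object per pair
plots <- lapply(seq_len(ncol(pair_mat)),
                function(j) plot_pair(pair_mat[1, j], pair_mat[2, j]))
length(plots)
```

This yields a list of separate plots, one per pair, which is exactly why they are hard to view side by side.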

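What I would like to do is create a single image in which the pairwise group plots are arranged as a pairs plot, with the levels of the mapped factor along the diagonal.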

The data look like this:

> str(meandist.SG)
'data.frame':   2400 obs. of  4 variables:
 $ starttime: num  0 0 0 0 0 0 0 0 60 60 ...
 $ mapped   : Factor w/ 8 levels "rowA","rowB",..: 1 2 3 4 5 6 7 8 1 2 ...
 $ meandist : num  123.2 115 91.9 112.8 108.6 ...
 $ se       : num  8.95 9.54 9.57 9.86 11.96 ...

> head(meandist.SG)
  starttime mapped meandist        se
1         0   rowA 123.1739  8.952757
2         0   rowB 114.9875  9.544961
3         0   rowC  91.8875  9.571005
4         0   rowD 112.7583  9.861424
5         0   rowE 108.5826 11.962127
6         0   rowF 126.4917  9.331622

I think I should be using the GGally package, but I can't figure out how to use the levels of a factor as the diagonal. Ideas?

1 Answer:

Answer 0: (score: 2)

If I understand you correctly, here is a solution using facets. I had to generate a demo dataset because your sample is not large enough.

library(ggplot2)
library(data.table)
# this generates the demo dataset - you have this already
set.seed(1)
df <- do.call(rbind,lapply(1:8,function(i){
  data.frame(starttime=seq(0,20000,100),
        mapped=LETTERS[i],
        meandist=100*i+rnorm(201,0,20),
        se=50)
}))
# you start here...
dt <- data.table(df)
# rename: H = group shown in the facet column, V = group shown in the facet row
setnames(dt,c("starttime","mapped","meandist","se"),c("x","H","y.H","se.H"))
setkey(dt,x)
# second copy of the data with V-suffixed columns, also keyed on x
gg <- dt[,list(V=H,y.V=y.H,se.V=se.H),keyby=x]
# join on x: every (H,V) combination at each time point
gg <- dt[gg, allow.cartesian=TRUE]
ggp <- ggplot(gg,aes(x=x))
ggp <- ggp + geom_line(aes(y=y.H, color=H))
# newer ggplot2 versions no longer support subset= in layers, so filter the data instead
ggp <- ggp + geom_line(data=gg[H!=V], aes(y=y.V, color=V))
ggp <- ggp + geom_ribbon(aes(ymin=y.H-se.H, ymax=y.H+se.H, fill=H), alpha=0.1)
ggp <- ggp + geom_ribbon(aes(ymin=y.V-se.V, ymax=y.V+se.V, fill=V), alpha=0.1)
ggp <- ggp + facet_grid(V~H, scales="free")
ggp <- ggp + guides(fill=guide_legend("mapped"),color=guide_legend("mapped"))
ggp <- ggp + theme(axis.text.x=element_text(angle=-90,vjust=.2, hjust=0))
ggp <- ggp + labs(x="Start Time",y="Mean Distance")
print(ggp)

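This creates a faceted pairs plot of meandist vs. starttime for each pair of groups ("mapped"). Note that there are two copies of each plot (one above and one below the diagonal).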
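If the mirrored copies are unwanted, one possible way to drop them is to keep only the rows where the column group does not come after the row group. The sketch below shows the filter on a toy table of panel combinations. The names lv, panels and lower are made up for illustration, and the same condition would have to be applied to the joined data before plotting. As far as I know, facet_grid still lays out the full grid, so the dropped panels would appear empty rather than disappear.

```r
# Toy table with one row per (H, V) panel, using three hypothetical levels
lv <- c("A", "B", "C")
panels <- expand.grid(H = factor(lv, levels = lv),
                      V = factor(lv, levels = lv))

# With facet_grid(V ~ H), V indexes rows and H indexes columns, so
# H <= V (by level order) keeps the diagonal and the panels below it.
lower <- panels[as.integer(panels$H) <= as.integer(panels$V), ]
nrow(lower)
```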

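This approach basically creates two copies of the dataset and does a Cartesian join on the x variable (starttime). I use data tables because the join is more efficient and the code is more compact. I renamed the columns for convenience.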
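To make the join more concrete, here is a tiny sketch of the same idea using base R's merge instead of data.table (a swapped-in technique, chosen only so the example has no dependencies). The toy values and the names toy, h, v and joined are invented for illustration.

```r
# Toy data: 2 time points x 3 groups (made-up values)
toy <- data.frame(x   = rep(c(0, 60), each = 3),
                  grp = rep(c("A", "B", "C"), times = 2),
                  y   = c(10, 20, 30, 11, 21, 31))

# Two copies of the data with different column names:
# H will index the facet columns, V the facet rows
h <- setNames(toy, c("x", "H", "y.H"))
v <- setNames(toy, c("x", "V", "y.V"))

# Joining on x alone pairs every H row with every V row that shares
# the same time point, giving all (H, V) combinations per x
joined <- merge(h, v, by = "x")
nrow(joined)   # 2 time points * 3 * 3 group pairs = 18
head(joined)
```

Each row of the result carries the y value for both members of a pair at one time point, which is what lets facet_grid(V~H) place one pair in each panel.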