How do I use a for loop in R to create a series of plots?

Date: 2013-10-25 14:20:14

Tags: r for-loop boxplot

I have a wide table that looks like this:

ID  Test_11 LVL11  Score_X_11 Score_Y_11  Test_12 LV12  Score_X_12  Score_Y_12
1   A       I      100        NA          NA      NA    100         100
2   A       II     90         100         B       II    90          85 
3   NA      NA     NA         NA          B       II    90          NA
4   A       III    100        80          A       III   75          75
5   B       I      NA         90          NA      NA    60          50
6   B       I      70         100         NA      NA    NA          NA
7   B       II     85         NA          A       I     60          60

The table used for ordering looks like this:

Test_11   A
Test_11   B
Test_12   A
Test_12   B

What the second table tells us is that Test_11 has two versions, A and B (and the same is true for Test_12).
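For anyone who wants to reproduce this, here is a minimal sketch that reads the table above into a data frame and derives the ordering table from it rather than typing it by hand. The names `test_cols` and `ordering` are only illustrative, and the column names of `ordering` (`test`, `version`) are my own assumption.

```r
# Read the example table from the question (NA marks a missing value).
df <- read.table(header = TRUE, na.strings = "NA", stringsAsFactors = FALSE, text = "
ID  Test_11 LVL11  Score_X_11 Score_Y_11  Test_12 LV12  Score_X_12  Score_Y_12
1   A       I      100        NA          NA      NA    100         100
2   A       II     90         100         B       II    90          85
3   NA      NA     NA         NA          B       II    90          NA
4   A       III    100        80          A       III   75          75
5   B       I      NA         90          NA      NA    60          50
6   B       I      70         100         NA      NA    NA          NA
7   B       II     85         NA          A       I     60          60
")

# Every column whose name starts with "Test_" is treated as a test column.
test_cols <- grep("^Test_", names(df), value = TRUE)

# One row per (test column, version) pair that actually occurs in the data.
ordering <- do.call(rbind, lapply(test_cols, function(col) {
  x <- df[[col]]
  data.frame(test = col,
             version = sort(unique(x[!is.na(x)])),
             stringsAsFactors = FALSE)
}))
print(ordering)
```

With dozens of test columns, the same two steps should produce the full list of combinations without any manual bookkeeping.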

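I am trying to create a series of boxplots that show the score distribution for each combination of test (Test_11, Test_12) and version (A, B). So for Test_11 == A, the boxplot would have three groups (I, II, III), drawn from the subset where Test_11 == A; then the same for Test_11 == B, Test_12 == A, and Test_12 == B. In this example, 4 plots should be created in total.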

What I have in R so far is:

z <- subset(df, df$Test_11=="A")
plot(z$LVL11, z$Score_X_11, varwidth = TRUE, notch = TRUE, xlab = 'LVL', 
     ylab = 'score')

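What I want, and cannot figure out how to do, is write a for loop that does the subsetting for me, so that I can automate this for my actual dataset, which has dozens of these combinations.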

Thanks for any help and guidance.

2 answers:

Answer 0 (score: 1)

The "straightforward" way

Perhaps you should store all the logical vectors in a data.frame or matrix before the loop:

selections <- matrix(nrow = nrow(df), ncol = 4)
selections[,1] <- df$Test_11 == "A"
selections[,2] <- df$Test_11 == "B"
selections[,3] <- df$Test_12 == "A"
selections[,4] <- df$Test_12 == "B"
# etc...
par(mfcol = c(2, 2)) # here you should customize at will...
for (i in 1:4) {
  z <- subset(df, selections[,i])
  plot(z$LVL11, z$Score_X_11, varwidth = TRUE, 
       notch = TRUE, xlab = 'LVL', 
       ylab = 'score')
}

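As written, the loop always plots z$LVL11 against z$Score_X_11, even for the Test_12 selections. You can change the code so that, instead of z$Score_X_11, you use z[, string], where the value of string is built with paste (or another string-manipulation function). For example: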

v <- c("X", "Y")
n <- c("11", "12")
for (i in 1:2) {
  for (j in 1:2) {
    string <- paste("Score", v[i], n[j], sep = "_")  # n[j], so every X/Y and 11/12 pair is built
    print(string)
  }
}

Similar reasoning applies to the z$LVLXX values, so you should be able to find a way to adapt it.
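Putting the two ideas together, a loop along these lines might work. This is only a sketch on hypothetical toy data that mirrors the question's layout (Score_X columns only). The `lvl_cols` lookup is my own addition, needed because the level columns are named inconsistently (LVL11 vs LV12), and `pdf(NULL)` is just one way to discard the graphics output while testing.

```r
# Hypothetical toy data with the same column layout as the question.
df <- data.frame(
  Test_11    = c("A", "A", NA, "A", "B", "B", "B"),
  LVL11      = c("I", "II", NA, "III", "I", "I", "II"),
  Score_X_11 = c(100, 90, NA, 100, NA, 70, 85),
  Test_12    = c(NA, "B", "B", "A", NA, NA, "A"),
  LV12       = c(NA, "II", "II", "III", NA, NA, "I"),
  Score_X_12 = c(100, 90, 90, 75, 60, NA, 60),
  stringsAsFactors = FALSE
)

# Map each test suffix to its level column (assumption: LVL11 / LV12 as in the question).
lvl_cols <- c("11" = "LVL11", "12" = "LV12")

pdf(NULL)      # discard output; use pdf("plots.pdf") instead to keep the plots
n_plots <- 0
for (n in names(lvl_cols)) {
  # Build the column names for this test from its suffix.
  test_col  <- paste("Test", n, sep = "_")
  score_col <- paste("Score", "X", n, sep = "_")
  lvl_col   <- lvl_cols[[n]]
  for (version in c("A", "B")) {
    # Keep rows of this version that have both a level and a score.
    keep <- !is.na(df[[test_col]]) & df[[test_col]] == version &
            !is.na(df[[lvl_col]]) & !is.na(df[[score_col]])
    z <- df[keep, ]
    if (nrow(z) == 0) next   # nothing to plot for this combination
    boxplot(z[[score_col]] ~ factor(z[[lvl_col]]),
            xlab = "LVL", ylab = "score",
            main = paste(test_col, "==", version))
    n_plots <- n_plots + 1
  }
}
invisible(dev.off())
```

Using `df[[name]]` with a constructed string is equivalent to the `z[, string]` idea above, and the explicit `is.na` checks avoid surprises when a test column contains missing values.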

Alternative way: ggplot2 & reshape2

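I am not very experienced with lattice graphics (as used in the other answer), but I know a bit of ggplot2, so I decided to take up the challenge and give it a try. It is not very pretty, but at least it works: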

# df <- read.table("data.txt", header = TRUE, na.strings = "NA")
require(reshape2)
require(ggplot2)

# Melt your data.frame, using the scores as the "values":
mdf <- melt(df[,-1], id = c("LVL11", "LV12", "Test_11", "Test_12"))

# loop through level types:
for (lvl in c("LVL11", "LV12")) {
  # looping through values of test11
  for (test11 in c("A", "B")) {
    # Note the use of subset before ggplot
    p <- ggplot(subset(mdf, Test_11 == test11), aes_string(x=lvl, y="value"))
    # I added the geom_jitter as in the example given there were only a few points
    g <- p + geom_boxplot(aes(fill = variable)) + geom_jitter(aes(shape = variable))
    print(g) # it is necessary to print g explicitly like this in order to use ggplot in a loop
    # Finally, save each plot with a relevant name:
    savePlot(paste0(lvl, "-t11", test11, ".png")) 
    # (note that savePlot has some problems with RStudio iirc)

  }
  # Same as before, but with test_12
  for (test12 in c("A", "B")) {
    p <- ggplot(subset(mdf, Test_12 == test12), aes_string(x=lvl, y="value"))
    g <- p + geom_boxplot(aes(fill = variable)) + geom_jitter(aes(shape = variable))
    print(g) 
    savePlot(paste0(lvl, "-t12", test12, ".png"))
  }
}

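If anyone knows how to use lattice graphics or facet_grid in this situation, so that I can put all the graphics in a single picture, I would love to hear about it.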
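One possible direction for the single-picture version, offered only as a sketch: stack the per-test columns into one long data frame with base R, then let facet_grid lay out one panel per test and version. The toy data, the `lvl_cols` lookup and the column names of `long` are all assumptions of mine, and only the Score_X columns are used to keep it short.

```r
library(ggplot2)

# Hypothetical toy data with the same column layout as the question.
df <- data.frame(
  Test_11    = c("A", "A", NA, "A", "B", "B", "B"),
  LVL11      = c("I", "II", NA, "III", "I", "I", "II"),
  Score_X_11 = c(100, 90, NA, 100, NA, 70, 85),
  Test_12    = c(NA, "B", "B", "A", NA, NA, "A"),
  LV12       = c(NA, "II", "II", "III", NA, NA, "I"),
  Score_X_12 = c(100, 90, 90, 75, 60, NA, 60),
  stringsAsFactors = FALSE
)
lvl_cols <- c("11" = "LVL11", "12" = "LV12")

# Stack the columns of each test into rows: test, version, level, score.
long <- do.call(rbind, lapply(names(lvl_cols), function(n) {
  data.frame(
    test    = paste0("Test_", n),
    version = df[[paste0("Test_", n)]],
    level   = df[[lvl_cols[[n]]]],
    score   = df[[paste0("Score_X_", n)]],
    stringsAsFactors = FALSE
  )
}))
long <- long[complete.cases(long), ]   # drop rows with any missing piece

# One panel per (test, version) combination, all in one figure.
p <- ggplot(long, aes(x = level, y = score)) +
  geom_boxplot() +
  facet_grid(test ~ version)
# print(p)  # draws the figure; ggsave("boxplots.png", p) would write it to disk
```

Because the panels come from the data rather than from nested loops, adding more tests should only require extending `lvl_cols`.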

Cheers.

Answer 1 (score: 1)

The classic plyr solution (HT to @hadleywickham)

require(plyr); require(lattice); require(gridExtra)
bplots <- dlply(dat, .(Test_11, Test_12), function(df){
  bwplot(Score_X_11 ~ LVL11, data = df)
})
do.call('grid.arrange', bplots)
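If installing plyr is not an option, the same split-then-plot idea can be sketched in base R with split(). This is not the code above, just a rough equivalent under my own assumptions: hypothetical toy data, grouping by Test_11 only, and `pdf(NULL)` to discard the graphics output.

```r
# Hypothetical toy data: only the columns needed for the Test_11 plots.
df <- data.frame(
  Test_11    = c("A", "A", NA, "A", "B", "B", "B"),
  LVL11      = c("I", "II", NA, "III", "I", "I", "II"),
  Score_X_11 = c(100, 90, NA, 100, NA, 70, 85),
  stringsAsFactors = FALSE
)

# split() returns one data frame per version; rows with NA in Test_11 are dropped.
groups <- split(df, df$Test_11)

pdf(NULL)   # discard output; use pdf("plots.pdf") instead to keep the plots
for (v in names(groups)) {
  boxplot(Score_X_11 ~ LVL11, data = groups[[v]],
          xlab = "LVL", ylab = "score",
          main = paste("Test_11 ==", v))
}
invisible(dev.off())
```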
