How to use melt() from the reshape2 package to stack data with its categorical labels and produce multiple side-by-side boxplots

Date: 2016-04-19 11:00:03

Tags: r boxplot reshape2 melt

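I am trying to use the melt() function from the reshape2 package in R to stack a data frame while retaining the categorical labels of each observation. My question is how to adapt Eric Cai's code to generate multiple side-by-side notched boxplots at the level of behaviours$Family (a 2-level factor column), grouped by each behavioural variable in the data set behaviours (a link to dummy data is provided below)?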

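My aim is to colour these notched boxplots by family (V4 = red and G8 = blue) and add a legend. However, when I try to arrange the data frame with the melt() function I run into a dimension problem that I cannot decipher. If anyone can help, many thanks in advance.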

The reproducible dummy data can be found at the bottom of the Stack Overflow page "Reproducible data".

Generating the boxplots

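Generate an object called boxplots.double that uses the formula Mean.value ~ Family + Behaviours to split the plot into 12 pairs of boxplots (i.e. within a single plot, each behaviour is split by behaviours$Family). In Eric Cai's code, "at =" is an option that specifies the positions of the boxplots along the horizontal axis, and xaxt = 'n' suppresses the default horizontal axis so that a custom axis can be added with axis() and title().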
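As a rough illustration of how the two-factor formula, "at =" and xaxt = "n" fit together, here is a minimal sketch on simulated data. The data frame toy, its values and the chosen positions are all made up for the example and are not taken from Eric Cai's code or from the behaviours data set.

```r
# Simulated data: 2 families x 3 behaviours (names and values are invented)
set.seed(1)
toy <- data.frame(
  Family     = factor(rep(c("V4", "G8"), each = 30), levels = c("V4", "G8")),
  Behaviours = factor(rep(c("Swimming", "Running", "Resting"), times = 20),
                      levels = c("Swimming", "Running", "Resting")),
  Values     = runif(60)
)

# Leave a gap between behaviours: boxes are drawn at 1,2  4,5  7,8
positions <- c(1, 2, 4, 5, 7, 8)

bp <- boxplot(Values ~ Family + Behaviours, data = toy,
              at = positions,          # x position of each box
              xaxt = "n",              # suppress the default x axis
              col = c("red", "blue"),  # recycled: V4 red, G8 blue in every pair
              xlab = "Behaviours", ylab = "Values")

# Custom axis: one label per behaviour, centred under each pair of boxes
axis(side = 1, at = c(1.5, 4.5, 7.5), labels = levels(toy$Behaviours))
title(main = "Values by family within each behaviour")
legend("topright", title = "Family", legend = levels(toy$Family),
       fill = c("red", "blue"))

# Family varies fastest, so the boxes come in V4/G8 pairs per behaviour
bp$names
```

Because Family is the first term in the formula, it varies fastest, which is why a two-colour vector is enough to colour every pair consistently.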

 Here is an example:

 I am trying to follow Eric Cai's instructions
 (1) Stack the data:
     (a) Retain the categorical (2 level factor column) for family [,1]
     (b) Retain all behavioural variables [,2:13]
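The stacking step above can be sketched on a tiny wide data frame. This is only an illustration under assumptions: the data frame wide and its values are hypothetical, and only two of the twelve behaviour columns are included.

```r
library(reshape2)

# Hypothetical wide data: one row per observation, one column per behaviour
wide <- data.frame(
  Family       = c("V4", "V4", "G8", "G8"),
  Swimming     = c(0.2, 0.4, 0.6, 0.8),
  Not.Swimming = c(0.8, 0.6, 0.4, 0.2)
)

# Keep Family as the identifier; every other column is stacked
long <- melt(wide, id.vars = "Family",
             variable.name = "Behaviours", value.name = "Values")

# The stacked data has (rows of wide) x (number of stacked columns) rows
dim(long)
head(long)
```

Checking dim(long) against nrow(wide) times the number of stacked columns is a quick way to spot a dimension problem before plotting.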

    # Set vectors for labelling the data

    behaviours.label = c("Swimming",
                         "Not.Swimming",
                         "Running",
                         "Not.Running",
                         "Fighting",
                         "Not.Fighting",
                         "Resting",
                         "Not.Resting",
                         "Hunting",
                         "Not.Hunting",
                         "Grooming",
                         "Not.Grooming")

    # "V4", "G8" repeated once per behaviour (24 labels in total)
    family.labels = rep(c("V4", "G8"), times = 12)

    library(tidyr)
    library(reshape2)

    # Reshape the data from wide to long format with tidyr
    data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
    head(data_long)

    # Stack the data while retaining the Family and behavioural variables
    stacked.data = melt(behaviours, id = c('Family', 'behaviours'))

    # Remove the column that gives the column name variable
    stacked.data = stacked.data[, -3]

    #head(stacked.data)
    colnames(stacked.data) <- c("Family", "Behaviours", "Values")

Error message

    boxplots.double = boxplot(values ~ Family + Behaviours,
                              data = stacked.data,
                              at = c(1:24),
                              xaxt = 'n',
                              ylim = c(min(0, min(-3)), max(7, na.rm = T)),
                              notch = TRUE,
                              col = c("red", "blue"),
                              names = c("V4", "G8"),
                              cex.axis = 1.0,
                              srt = 45)

    axis(side = 1, at = c(1.8, 6.8), labels = c("Swimming",
                                                "Not.Swimming",
                                                "Running",
                                                "Not.Running",
                                                "Fighting",
                                                "Not.Fighting",
                                                "Resting",
                                                "Not.Resting",
                                                "Hunting",
                                                "Not.Hunting",
                                                "Grooming",
                                                "Not.Grooming"), line = 0.5, lwd = 0)

1 answer:

Answer 0 (score: 1)

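With help from Richard Telford, this code uses the melt() function from the reshape2 package to produce multiple side-by-side boxplots, grouped by the levels of the categorical (2-level) column called Family.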

    # Clear the workspace
    rm(list = ls())

    # Load the behaviours data frame (assumes the dummy data linked in the
    # question is available as a dataset named behaviours)
    data(behaviours)

    # Set vectors for labelling the data

    behaviours.labels = c("Swimming",
                          "Not.Swimming",
                          "Running",
                          "Not.Running",
                          "Fighting",
                          "Not.Fighting",
                          "Resting",
                          "Not.Resting",
                          "Hunting",
                          "Not.Hunting",
                          "Grooming",
                          "Not.Grooming")

    # "V4", "G8" repeated once per behaviour (24 labels in total)
    family.labels = rep(c("V4", "G8"), times = 12)

    library(tidyr)
    library(reshape2)

    # Restructure the data from wide to long format

    data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
    head(data_long)

    # Stack the data with melt(), retaining Family and the behaviour name (x);
    # the values come from behaviours[, 2:13]

    stacked.data = melt(data_long, id = c('Family', 'x'))
    head(stacked.data)

    # Remove the third column (`variable`), which only holds the name of the melted column

    stacked.data = stacked.data[, -3]
    head(stacked.data)

    # Rename the column headings

    colnames(stacked.data) <- c("Family", "Behaviours", "Values")

    # Keep the behaviours in their original order; otherwise boxplot() sorts a
    # character column alphabetically and the axis labels no longer match the boxes

    stacked.data$Behaviours <- factor(stacked.data$Behaviours, levels = behaviours.labels)

    # Generate the side-by-side boxplots

    windows(height = 10, width = 14)   # Windows only; dev.new() is the portable equivalent
    par(mar = c(9, 7, 4, 4) + 0.3, mgp = c(5, 1.5, 0))

    boxplots.double = boxplot(Values ~ Family + Behaviours,
                              data = stacked.data,
                              at = 1:24,
                              ylim = c(0, 1.8),
                              xaxt = "n",
                              notch = TRUE,
                              col = c("red", "blue"),
                              cex.axis = 0.7,
                              cex.lab = 0.7,
                              ylab = "Values",
                              xlab = "Behaviours")

    # Custom x axis: a tick per pair of boxes, with rotated behaviour labels drawn by text()
    axis(side = 1, at = seq(2, 24, by = 2), labels = FALSE)
    text(seq(2, 24, by = 2), par("usr")[3] - 0.2, labels = unique(behaviours.labels), srt = 45, pos = 1, xpd = TRUE, cex = 0.8)

    # The fill colours assume G8 is the first Family level (red) and V4 the second (blue)
    legend("topright", title = "Family", cex = 1.0, legend = c("V4", "G8"), fill = c("blue", "red"), lty = c(1, 1))
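The labels drawn with text() only line up with the boxes if the box order matches the label order. One way to check this, sketched here on made-up data (the data frame toy is hypothetical), is to inspect the names component of the object returned by boxplot(): character columns are ordered alphabetically unless they are converted to factors with explicit levels.

```r
# Made-up long data with character (not factor) grouping columns
set.seed(42)
toy <- data.frame(
  Family     = rep(c("V4", "G8"), times = 20),
  Behaviours = rep(c("Swimming", "Not.Swimming"), each = 20),
  Values     = runif(40),
  stringsAsFactors = FALSE
)

# plot = FALSE returns the box statistics without drawing anything
bp1 <- boxplot(Values ~ Family + Behaviours, data = toy, plot = FALSE)
bp1$names   # alphabetical order within each grouping variable

# Fix the order explicitly by setting the factor levels
toy$Family     <- factor(toy$Family, levels = c("V4", "G8"))
toy$Behaviours <- factor(toy$Behaviours, levels = c("Swimming", "Not.Swimming"))

bp2 <- boxplot(Values ~ Family + Behaviours, data = toy, plot = FALSE)
bp2$names   # now follows the chosen level order
```

Comparing bp$names with the label and colour vectors before adding the axis and legend helps catch a mismatch between boxes, labels and colours.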
