Reshaping data into a specific form

Time: 2016-10-08 09:37:22

Tags: r reshape data-munging

My data looks like the following. This is a simplified dataset with a single experiment; in reality I have several:

DF=structure(list(theoric = c("E", "E", "F", "F", "F"), observed = c("E", 
"E", "F", "F", "E"), experiment = c("RO(2)", "RO(2)", "RO(2)", "RO(2)", 
"RO(2)")), .Names = c("theoric", "observed", "experiment"), row.names = 2:6, class = "data.frame")

Right now my data has the following form:

  theoric observed experiment
2       E        E      RO(2)
3       E        E      RO(2)
4       F        F      RO(2)
5       F        F      RO(2)
6       F        E      RO(2)

I would like to reshape it as follows:

                  2 3 4 5 6
RO(2) theoric     E E F F F
RO(2) observed    E E F F E

What is the simplest way to do this? I really don't know how to go about it. I tried

meltR <- melt(DF, id="experiment")

but I lose all correspondence between theoric and observed. Many thanks.

Edit: the full dataset:

DF=structure(list(theoric = c("E", "E", "F", "F", "F", "E", "F", 
"F", "F", "F", "F", "E", "E", "E", "E"), observed = c("E", "E", 
"F", "F", "E", "F", "F", "F", "F", "F", "F", "E", "E", "E", "F"
), experiment = c("RO", "RO", "RO", "RO", "RO", "MO", "MO", "MO", 
"MO", "MO", "MO", "EL", "EL", "EL", "EL")), .Names = c("theoric", 
"observed", "experiment"), row.names = c(2L, 3L, 4L, 5L, 6L, 
24L, 25L, 26L, 27L, 28L, 29L, 21L, 22L, 23L, 13L), class = "data.frame")

Output:

    col2 col1.2 col1.3 col1.4 col1.5 col1.6 col1.24 col1.25 col1.26
1   RO theoric      E      E      F      F      F    <NA>    <NA>    <NA>
6   MO theoric   <NA>   <NA>   <NA>   <NA>   <NA>       E       F       F
12  EL theoric   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
16 RO observed      E      E      F      F      E    <NA>    <NA>    <NA>
21 MO observed   <NA>   <NA>   <NA>   <NA>   <NA>       F       F       F
27 EL observed   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
   col1.27 col1.28 col1.29 col1.21 col1.22 col1.23 col1.13
1     <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
6        F       F       F    <NA>    <NA>    <NA>    <NA>
12    <NA>    <NA>    <NA>       E       E       E       E
16    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
21       F       F       F    <NA>    <NA>    <NA>    <NA>
27    <NA>    <NA>    <NA>       E       E       E       F

Edit 2: added the EL output:

RO theoric     E E F F F
RO observed    E E F F E
MO theoric     E F F F F
MO observed    F F F F F
EL theoric     E E E E
EL observed    E E E F

2 answers:

Answer 0 (score: 3)

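Based on the expected output, we may need to create a column from the row.names. Create a new dataset ('df2') by unlisting the first two columns and replicating the 'experiment' column and the row names. Then use reshape from base R to convert the 'long' format to 'wide'.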

df2 <- data.frame(col1 = unlist(DF[1:2], use.names = FALSE),
                  col2 = paste(rep(DF$experiment, 2),
                               rep(colnames(DF)[1:2], each = nrow(DF))),
                  col3 = rep(row.names(DF), 2))

reshape(df2, idvar = "col2", direction="wide", timevar = "col3")
#             col2 col1.2 col1.3 col1.4 col1.5 col1.6
#1  RO(2) theoric      E      E      F      F      F
#6 RO(2) observed      E      E      F      F      E
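
As a small follow-up to the reshape call above, here is a self-contained sketch that rebuilds the simplified data, runs the same steps, and then strips the "col1." prefix so the columns are labelled by the original row names, as in the requested layout. The name wide and the two clean-up lines at the end are my own additions for illustration, not part of the code above.

```r
# Simplified data from the question, rebuilt so the example runs on its own
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F"),
  observed   = c("E", "E", "F", "F", "E"),
  experiment = rep("RO(2)", 5),
  row.names  = 2:6,
  stringsAsFactors = FALSE
)

# Long form: one row per (original row, variable) pair
df2 <- data.frame(
  col1 = unlist(DF[1:2], use.names = FALSE),
  col2 = paste(rep(DF$experiment, 2),
               rep(colnames(DF)[1:2], each = nrow(DF))),
  col3 = rep(row.names(DF), 2),
  stringsAsFactors = FALSE
)

# Wide form: one row per "experiment variable" label
wide <- reshape(df2, idvar = "col2", timevar = "col3", direction = "wide")

# Illustrative clean-up (an assumption about the desired labels):
# drop the "col1." prefix and reset the row names
names(wide) <- sub("^col1\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

The prefix comes from reshape itself, which names each wide column after the value column and the timevar value, so removing it afterwards is purely cosmetic.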

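Or use melt/dcast from data.table. Convert the 'data.frame' to a 'data.table' while keeping the row names as a column (setDT(DF, keep.rownames = TRUE)), melt to 'long' format, paste the 'experiment' and 'variable' columns together, and then dcast from 'long' to 'wide' format.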

library(data.table)
dcast(melt(setDT(DF, keep.rownames = TRUE), id.var = c("rn", "experiment"))[,
    experiment := paste(experiment, variable)], experiment~rn, value.var = "value")
#       experiment 2 3 4 5 6
#1: RO(2) observed E E F F E
#2:  RO(2) theoric E E F F F
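
The one-liner above may be easier to follow when split into steps. The sketch below is one possible unrolling under a couple of assumptions of mine: it uses as.data.table() to work on a copy (setDT() converts DF in place, by reference), and it stores the pasted label in a new column called label, a name chosen only for this illustration.

```r
library(data.table)

# Simplified data from the question, rebuilt so the example runs on its own
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F"),
  observed   = c("E", "E", "F", "F", "E"),
  experiment = rep("RO(2)", 5),
  row.names  = 2:6,
  stringsAsFactors = FALSE
)

# Step 1: copy to a data.table, keeping the row names in a column named "rn"
DT <- as.data.table(DF, keep.rownames = TRUE)

# Step 2: long format, one row per (rn, experiment, variable)
long <- melt(DT, id.vars = c("rn", "experiment"))

# Step 3: build the row label ("label" is a hypothetical column name)
long[, label := paste(experiment, variable)]

# Step 4: back to wide, one column per original row name
wide <- dcast(long, label ~ rn, value.var = "value")
wide
```

Working on a copy leaves DF as a plain data.frame, which can matter if later code relies on its row names.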

Update

With the new dataset:

library(data.table)#v1.9.7+
dcast(melt(setDT(DF), id.var = "experiment"), paste(experiment, 
    variable)~rowid(experiment, variable), value.var="value", fill="")
#    experiment 1 2 3 4 5 6
#1: EL observed E E E F    
#2:  EL theoric E E E E    
#3: MO observed F F F F F F
#4:  MO theoric E F F F F F
#5: RO observed E E F F E  
#6:  RO theoric E E F F F  
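
In the call above, rowid() numbers the rows within each group, which is what lets experiments of different lengths share the columns 1 to 6. For readers who prefer to stay in base R, the following is a rough sketch of the same idea using ave() for the within-group counter and reshape() for the widening. The names pos, long and wide are made up for this example, and the original row names are left out because they are not needed here.

```r
# Full dataset from the question (row names omitted; they are not used below)
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F", "E", "F", "F", "F", "F", "F",
                 "E", "E", "E", "E"),
  observed   = c("E", "E", "F", "F", "E", "F", "F", "F", "F", "F", "F",
                 "E", "E", "E", "F"),
  experiment = rep(c("RO", "MO", "EL"), times = c(5, 6, 4)),
  stringsAsFactors = FALSE
)

# Position of each row within its experiment: 1, 2, 3, ... restarting per group
pos <- ave(seq_len(nrow(DF)), DF$experiment, FUN = seq_along)

# Long form: one row per (experiment, variable, position)
long <- data.frame(
  label = paste(rep(DF$experiment, 2),
                rep(c("theoric", "observed"), each = nrow(DF))),
  pos   = rep(pos, 2),
  value = c(DF$theoric, DF$observed),
  stringsAsFactors = FALSE
)

# Wide form: one column per position, one row per label
wide <- reshape(long, idvar = "label", timevar = "pos", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))
wide[is.na(wide)] <- ""   # shorter experiments get empty cells instead of NA
rownames(wide) <- NULL
wide
```

Using a per-experiment position instead of the original row names is what avoids the mostly-NA output seen when the first approach is applied to the full dataset.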

Answer 1 (score: 1)

You can also do the following:

require(tidyverse)
DF %>% 
  gather(type, val, theoric, observed) %>% 
  unite(experiment, experiment, type, sep=" ") %>% 
  group_by(experiment) %>% 
  mutate(experiment_number = 1:n()) %>% 
  spread(experiment_number, val, fill="")

This gives you:

   experiment   `1`   `2`   `3`   `4`   `5`   `6`
*       <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 EL observed     E     E     E     F            
2  EL theoric     E     E     E     E            
3 MO observed     F     F     F     F     F     F
4  MO theoric     E     F     F     F     F     F
5 RO observed     E     E     F     F     E      
6  RO theoric     E     E     F     F     F
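
Since gather() and spread() have been superseded in tidyr by pivot_longer() and pivot_wider(), the same pipeline could probably be written as below. This is only a sketch, assuming tidyr 1.0.0 or later; the column names pos, type and val and the result name wide are placeholders chosen for the example.

```r
library(dplyr)
library(tidyr)

# Full dataset from the question (row names omitted; they are not used below)
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F", "E", "F", "F", "F", "F", "F",
                 "E", "E", "E", "E"),
  observed   = c("E", "E", "F", "F", "E", "F", "F", "F", "F", "F", "F",
                 "E", "E", "E", "F"),
  experiment = rep(c("RO", "MO", "EL"), times = c(5, 6, 4)),
  stringsAsFactors = FALSE
)

wide <- DF %>%
  group_by(experiment) %>%
  mutate(pos = row_number()) %>%          # position within each experiment
  ungroup() %>%
  pivot_longer(c(theoric, observed),      # long: one row per value
               names_to = "type", values_to = "val") %>%
  unite(experiment, experiment, type, sep = " ") %>%
  pivot_wider(names_from = pos, values_from = val,
              values_fill = list(val = ""))  # empty string for missing cells
wide
```

Unlike spread(), pivot_wider() should not sort the rows by label, so the result is expected to follow the order in which the labels first appear in the data.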