嵌套重塑从宽到长

时间:2017-08-31 13:04:19

标签: r reshape

在尝试将对象重塑为长方向时,我会继续收到所有类型的错误消息。玩具数据:

d <- structure(c(0.204, 0.036, 0.015, 0.013, 0.208, 0.037, 0.015, 
0.006, 0.186, 0.044, 0.016, 0.023, 0.251, 0.044, 0.02, 0.01, 
0.268, 0.04, 0.007, 0.007, 0.208, 0.062, 0.027, 0.036, 0.272, 
0.054, 0.006, 0.01, 0.274, 0.05, 0.011, 0.006, 0.28, 0.039, 0.007, 
0.019, 1.93, 0.345, 0.087, 0.094, 2.007, 0.341, 0.064, 0.061, 
1.733, 0.39, 0.131, 0.201, 0.094, 0.01, 0.004, 0, 0.096, 0.014, 
0, 0.001, 0.081, 0.016, 0.002, 0.016, 0.062, 0.007, 0.011, 0.001, 
0.07, 0.003, 0.005, 0.002, 0.043, 0.033, 0, 0.007, 0.081, 0.039, 
0.007, 0, 0.085, 0.033, 0.008, 0, 0.086, 0.023, 0.007, 0.007, 
0.083, 0.015, 0, 0, 0.09, 0.009, 0, 0, 0.049, 0.052, 0, 0.025, 
2.779, 0.203, 0.098, 0.016, 2.801, 0.242, 0.135, 0.01, 2.12, 
0.466, 0.177, 0.121, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 0, 1, 
2, 3, 0, 1, 2, 3, 0, 1, 2, 3), .Dim = c(12L, 11L), .Dimnames = list(
    c("0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3"), c("age_77", "age_78", "age_79", "age_80", "age_81", 
    "age_82", "age_83", "age_84", "age_85", "item", "k")))

基本上我有不同的年龄,报告了3个项目,每个项目有4个响应类别。我想获得一个colnames = age, item, k, proportion的长形物体,如下所示:

  structure(c(77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 78,
78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 1, 1, 1, 1, 2, 2, 
2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 0, 1, 2, 
3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 
0.204, 0.036, 0.015, 0.013, 0.208, 0.037, 0.015, 0.006, 0.186, 
0.044, 0.016, 0.023, 0.251, 0.044, 0.02, 0.01, 0.268, 0.04, 0.007, 
0.007, 0.208, 0.062, 0.027, 0.036), .Dim = c(24L, 4L), .Dimnames = list(
    c("0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3", "0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3"), c("age", "item", "k", "proportion")))

我试过的一个例子:

reshape(as.data.frame(d), varying =1:9, sep = "_", direction = "long",
                        times = "k", idvar = "item")
Error in `row.names<-.data.frame`(`*tmp*`, value = paste(ids, times[i],  : 

不允许重复'row.names'

任何线索哪里是我的错?非常感谢!

1 个答案:

答案 0 :(得分:2)

OP提供的对象d不是data.frame,而是导致错误的矩阵:

str(d)
 num [1:12, 1:11] 0.204 0.036 0.015 0.013 0.208 0.037 0.015 0.006 0.186 0.044 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:12] "0" "1" "2" "3" ...
  ..$ : chr [1:11] "age_77" "age_78" "age_79" "age_80" ...

此外,行号不是唯一的,在将d强制转换为data.frame时也会导致错误。

使用data.tabled可以强制转换为data.table对象,并使用melt()从宽格式转换为长格式。最后,从列名中提取age并将其存储为OP请求的整数值。

library(data.table)
melt(as.data.table(d), measure.vars = patterns("^age_"),
     variable.name = "age", value.name = "proportion")[
  , age := as.integer(stringr::str_replace(age, "age_", ""))][]
     item k age proportion
  1:    1 0  77      0.204
  2:    1 1  77      0.036
  3:    1 2  77      0.015
  4:    1 3  77      0.013
  5:    2 0  77      0.208
 ---                      
104:    2 3  85      0.010
105:    3 0  85      2.120
106:    3 1  85      0.466
107:    3 2  85      0.177
108:    3 3  85      0.121