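Transpose data by a column while keeping duplicated values (not quite the same as long-to-wide reshaping)

Asked: 2018-12-01 22:37:07

Tags: r reshape

This is slightly different from a regular long-to-wide reshape. (Please don't flag it as a duplicate.)

I have data like the following. I want to transpose it by the term column, filling each new column with the corresponding values from the subject column. The result should look like Df_result: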

DF <- data.frame(ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
             term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
             subject = c("math1", "phys1", "math2", "chem1", "cmp1", "math1", "phys1", "math2", "math1", "phys1"),
             graduation = c ("grad", "grad", "grad", "grad", "grad", "drop", "drop", "drop", "enrolled", "enrolled"))

DF:

ID   term   subject   graduation
10    1      math1      grad
10    1      phys1      grad
10    2      math2      grad
10    2      chem1      grad
10    3      cmp1       grad
11    1      math1      drop
11    1      phys1      drop
11    2      math2      drop
12    1      math1      enrolled
12    1      phys1      enrolled

Df_result:

ID  term1  term2  term3   graduation
10  math1  math2  cmp1     grad
10  phys1  chem1  NA       grad
11  math1  math2  NA       drop
11  phys1   NA    NA       drop
12  math1   NA    NA       enrolled
12  phys1   NA    NA       enrolled

Using reshape gets close to what I want, but it keeps only the first match for each term.

reshape(DF, idvar = c("ID", "graduation"), timevar = "term", direction = "wide")

It produces:

  ID graduation subject.1 subject.2 subject.3
1 10       grad     math1     math2      cmp1
6 11       drop     math1     math2      <NA>
9 12   enrolled     math1      <NA>      <NA>

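The problem is that, for each value of timevar, only the first matching row is kept. Using dcast with melt does not help either: it fills the cells with counts, because it falls back to length as the aggregation function.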
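As a quick check of why both approaches fall short, here is a minimal base R sketch that counts the rows per (ID, term) pair. The names counts, n_rows, and dup_keys are illustrative only and are not part of the original post. Any pair that occurs more than once cannot fit into a single wide cell without an extra key:

```r
# Same data as in the question (rebuilt here so the sketch is self-contained)
DF <- data.frame(
  ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
  term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
  subject = c("math1", "phys1", "math2", "chem1", "cmp1",
              "math1", "phys1", "math2", "math1", "phys1"),
  graduation = c("grad", "grad", "grad", "grad", "grad",
                 "drop", "drop", "drop", "enrolled", "enrolled"),
  stringsAsFactors = FALSE
)

# Count how many rows share each (ID, term) combination
counts <- aggregate(subject ~ ID + term, data = DF, FUN = length)
names(counts)[names(counts) == "subject"] <- "n_rows"

# Combinations with more than one row are the ones that get collapsed
dup_keys <- counts[counts$n_rows > 1, ]
print(dup_keys)
```

If dup_keys is non-empty, a plain wide reshape on ID and term alone has to either drop rows or aggregate them.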

How can I solve this in R?

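1 Answer:

Answer 0: (score: 2)

This is the same as a long-to-wide reshape, but you need a new variable that uniquely identifies the rows in the new format. Below I call this variable classnum, and I use data.table syntax to create it: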

# add helper variable "classnum": a running index within each (ID, term) group
library(data.table)
setDT(DF)
DF[, classnum := seq_len(.N), by = .(ID, term)]

# reshape long-to-wide: one column per term, filled with subject
tidyr::spread(DF, term, subject)

Result:

   ID graduation classnum     1     2    3
1: 10       grad        1 math1 math2 cmp1
2: 10       grad        2 phys1 chem1 <NA>
3: 11       drop        1 math1 math2 <NA>
4: 11       drop        2 phys1  <NA> <NA>
5: 12   enrolled        1 math1  <NA> <NA>
6: 12   enrolled        2 phys1  <NA> <NA>
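
For comparison, the same idea can probably be expressed in base R alone. This is a minimal sketch rather than the answerer's own method: it assumes ave() is used to build the classnum helper and that classnum is then added to idvar in reshape(). The name wide is illustrative only.

```r
# Same data as in the question (rebuilt here so the sketch is self-contained)
DF <- data.frame(
  ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
  term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
  subject = c("math1", "phys1", "math2", "chem1", "cmp1",
              "math1", "phys1", "math2", "math1", "phys1"),
  graduation = c("grad", "grad", "grad", "grad", "grad",
                 "drop", "drop", "drop", "enrolled", "enrolled"),
  stringsAsFactors = FALSE
)

# Helper variable: running index within each (ID, term) group
DF$classnum <- ave(seq_len(nrow(DF)), DF$ID, DF$term, FUN = seq_along)

# With classnum among the id variables, each (ID, graduation, classnum, term)
# combination is unique, so reshape() no longer drops rows
wide <- reshape(DF,
                idvar = c("ID", "graduation", "classnum"),
                timevar = "term",
                direction = "wide")

# Sort so that the rows for each ID sit together
wide <- wide[order(wide$ID, wide$classnum), ]
print(wide)
```

The resulting columns are named subject.1, subject.2, and subject.3, following the naming reshape() already used in the question's attempt; classnum can be dropped afterwards if it is not needed.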