Reshaping wide to long with multiple column groups using data.table

Time: 2018-01-15 14:27:36

Tags: r data.table

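I have a wide-format data frame, shown below. I want to reshape it from wide to long using data.table's melt function. In a simple case I could split the data into two parts and then rbind the two data sets. But in my case there are multiple test(i) / test(i)gr column pairs, so there must be a better, more efficient way to do this. Thanks in advance.

From: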

id <- c("106E1258", "106E2037", "104E1182", "105E1248", "105E1470", "10241247",
        "10241703")
yr <- c(2017, 2017, 2015, 2016, 2016, 2013, 2013)
finalgr <- c(72, 76, 75, 71, 75, 77, 78)
test01 <- c("R0560", "R0066", "R0308", "R0129", "R0354", "R0483",
            "R0503")
test01gr <- c(73, 74, 67, 80, 64, 80, 70)
test02 <- c("R0660", "R0266", "R0302", "R0139", "R0324", "R0383",
            "R0503")
test02gr <- c(71, 54, 67, 70, 68, 81, 61)
dt <- data.frame(id = id, yr = yr,
                 finalgr = finalgr,
                 test01 = test01, test01gr = test01gr,
                 test02 = test02, test02gr = test02gr)

To:

id2<-c("106E1258","106E1258","104E1182","104E1182")
yr2<-c(2017,2017,2015,2015)
finalgr<-c(72,72,75,75)
testid<-c("R0560","R0660","R0308","R0302")
testgr<-c(73,71,67,67)
dt2<-data.frame(id=id2,yr=yr2,finalgr=finalgr,testid=testid,testgr=testgr)

1 answer:

Answer 0 (score: 4)

You should indeed use melt:

setDT(dt)
melt(dt, id.vars = c('id', 'yr', 'finalgr'), 
     measure.vars = list(testid = c('test01', 'test02'),
                         testgr = c('test01gr', 'test02gr')))
#           id   yr finalgr variable testid testgr
#  1: 106E1258 2017      72        1  R0560     73
#  2: 106E2037 2017      76        1  R0066     74
#  3: 104E1182 2015      75        1  R0308     67
#  4: 105E1248 2016      71        1  R0129     80
#  5: 105E1470 2016      75        1  R0354     64
#  6: 10241247 2013      77        1  R0483     80
#  7: 10241703 2013      78        1  R0503     70
#  8: 106E1258 2017      72        2  R0660     71
#  9: 106E2037 2017      76        2  R0266     54
# 10: 104E1182 2015      75        2  R0302     67
# 11: 105E1248 2016      71        2  R0139     70
# 12: 105E1470 2016      75        2  R0324     68
# 13: 10241247 2013      77        2  R0383     81
# 14: 10241703 2013      78        2  R0503     61
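
The melted result stacks all first-test rows before all second-test rows, whereas the target in the question keeps each id's tests together. A minimal sketch of one way to get that layout, assuming a data.table version that accepts a named list in measure.vars as used above; the helper column `row` is hypothetical and only there to remember the original row order:

```r
library(data.table)

# Two rows copied from the question's data, enough to show the reordering
dt <- data.table(
  id       = c("106E1258", "104E1182"),
  yr       = c(2017, 2015),
  finalgr  = c(72, 75),
  test01   = c("R0560", "R0308"),
  test01gr = c(73, 67),
  test02   = c("R0660", "R0302"),
  test02gr = c(71, 67)
)

# Hypothetical helper column: remember each row's original position
dt[, row := .I]

long <- melt(dt, id.vars = c("row", "id", "yr", "finalgr"),
             measure.vars = list(testid = c("test01", "test02"),
                                 testgr = c("test01gr", "test02gr")))

# Group each id's tests together, then drop the bookkeeping columns
setorder(long, row, variable)
long[, c("row", "variable") := NULL]
```

Sorting on `row` rather than `id` keeps the ids in their original order instead of alphabetical order.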

If there are more test columns, you can use patterns:

melt(dt, id.vars = c('id', 'yr', 'finalgr'), 
     measure.vars = patterns(testid = 'test[0-9]+$', testgr = 'test[0-9]+gr'))
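
Before melting, it can help to confirm which columns each regular expression picks up, since melt pairs the two groups by position. A small sketch using base R's grep on the column names from the question (the variable names `cols`, `id_cols` and `gr_cols` are only illustrative):

```r
# Column names copied from the question's data frame
cols <- c("id", "yr", "finalgr", "test01", "test01gr", "test02", "test02gr")

# Columns matched by each pattern passed to patterns()
id_cols <- grep("test[0-9]+$", cols, value = TRUE)   # test ids
gr_cols <- grep("test[0-9]+gr", cols, value = TRUE)  # test grades

# Both groups should have the same length and line up pairwise
stopifnot(length(id_cols) == length(gr_cols))
```

Note that the second pattern is unanchored at the end, so a column such as a hypothetical `test01grade` would also match; appending `$` makes it stricter.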

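Using a named list for measure.vars is a brand-new feature from a pull request I just filed, and it is currently available only in the development version (1.10.5). To use it, see the data.table installation instructions for the development version, or wait for 1.10.6 to be released on CRAN.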
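
On a version without named measure.vars, the same column names can be obtained by passing unnamed patterns and setting the output names through melt's value.name argument. A minimal sketch on two rows copied from the question's data:

```r
library(data.table)

# Two rows copied from the question's data
dt <- data.table(
  id       = c("106E1258", "104E1182"),
  yr       = c(2017, 2015),
  finalgr  = c(72, 75),
  test01   = c("R0560", "R0308"),
  test01gr = c(73, 67),
  test02   = c("R0660", "R0302"),
  test02gr = c(71, 67)
)

# Unnamed patterns select the two column groups;
# value.name supplies the names of the resulting value columns
long <- melt(
  dt,
  id.vars      = c("id", "yr", "finalgr"),
  measure.vars = patterns("^test[0-9]+$", "^test[0-9]+gr$"),
  value.name   = c("testid", "testgr")
)
```

The names in value.name are matched to the patterns by position, so their order must follow the order of the patterns.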