Reshaping data in R, skipping some measure variables

Time: 2013-01-25 16:55:52

Tags: r reshape reshape2

I would like to reshape a data.frame that looks like this:

     permno         dte ttm var1 var2 var3
1    123  2012-01-01  20    1   10  100
2    123  2012-01-01  30   -1   10  100
3    124  2012-01-01  20    2   20  200
4    124  2012-01-01  30   -2   20  200

I would like the resulting data.frame to look like this:

  permno         dte var1_20 var1_30 var2 var3
1    123  2012-01-01       1      -1   10  100
2    124  2012-01-01       2      -2   20  200

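I have been trying to do this with the reshape2 package, but I cannot isolate var1 from the other variables, so I keep getting var2_20 and var2_30 in the result. Does anyone know how to do this with reshape2?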

dput of the input and the desired result:

> dput(DF)
structure(list(permno = c(123L, 123L, 124L, 124L), dte = structure(c(1L, 
1L, 1L, 1L), .Label = " 2012-01-01", class = "factor"), ttm = c(20L, 
30L, 20L, 30L), var1 = c(1L, -1L, 2L, -2L), var2 = c(10L, 10L, 
20L, 20L), var3 = c(100L, 100L, 200L, 200L)), .Names = c("permno", 
"dte", "ttm", "var1", "var2", "var3"), class = "data.frame", row.names = c(NA, 
-4L))
> dput(result)
structure(list(permno = 123:124, dte = structure(c(1L, 1L), .Label = " 2012-01-01", class = "factor"), 
    var1_20 = 1:2, var1_30 = c(-1L, -2L), var2 = c(10L, 20L), 
    var3 = c(100L, 200L)), .Names = c("permno", "dte", "var1_20", 
"var1_30", "var2", "var3"), class = "data.frame", row.names = c(NA, 
-2L)) 

2 answers:

Answer 0 (score: 3)

Use a combination of merge, reshape, and unique, like this:

unique(merge(DF[-c(3:4)], 
             reshape(DF[1:4], direction = "wide", 
                     idvar = c("permno", "dte"), 
                     timevar="ttm")))
#   permno         dte var2 var3 var1.20 var1.30
# 1    123  2012-01-01   10  100       1      -1
# 3    124  2012-01-01   20  200       2      -2

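Basically, you reshape only the columns that need reshaping (here ttm and var1, together with the id columns), and you drop those columns from the original dataset before merging. The merge leaves you with duplicated rows, so wrap the whole thing in unique to get (almost) the desired output. You can rearrange the column order afterwards if needed.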
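The same idea can be written out step by step. This is only a sketch using base R: it rebuilds the question's DF (with dte stored as a plain character column, which is an assumption made here for simplicity), passes sep = "_" to reshape so the new columns come out as var1_20 and var1_30, and de-duplicates the constant columns before the merge rather than after it.

```r
# Rebuild the question's data; dte as character is a simplifying assumption
DF <- data.frame(
  permno = c(123L, 123L, 124L, 124L),
  dte    = "2012-01-01",
  ttm    = c(20L, 30L, 20L, 30L),
  var1   = c(1L, -1L, 2L, -2L),
  var2   = c(10L, 10L, 20L, 20L),
  var3   = c(100L, 100L, 200L, 200L),
  stringsAsFactors = FALSE
)

# Step 1: reshape only the id columns, the time column and var1
wide <- reshape(DF[c("permno", "dte", "ttm", "var1")],
                direction = "wide",
                idvar     = c("permno", "dte"),
                timevar   = "ttm",
                sep       = "_")

# Step 2: keep the untouched columns, one row per id combination
const <- unique(DF[c("permno", "dte", "var2", "var3")])

# Step 3: merge the two pieces and put the columns in the requested order
res <- merge(wide, const, by = c("permno", "dte"))
res <- res[c("permno", "dte", "var1_20", "var1_30", "var2", "var3")]
res
```

Like the one-liner above, this relies on var2 and var3 being constant within each permno/dte pair. If they are not, const will contain several rows per id and the merge will return more than one row per id.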

Answer 1 (score: 2)

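I feel fairly clever about this answer, but I strongly suspect I am assuming too much about your data, in particular that var2 and var3 are constant within each permno/dte group: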

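library(plyr); library(reshape2)  # ddply() comes from plyr, dcast() from reshape2
# `dat` is the question's data.frame (named DF in the dput above)
ddply(dat,.(permno,dte,var2,var3),
      function(x) { dcast(x,permno + dte + var2 + var3 ~ ttm,value.var = 'var1') })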
  permno         dte var2 var3 20 30
1    123  2012-01-01   10  100  1 -1
2    124  2012-01-01   20  200  2 -2
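Under the same assumption (var2 and var3 constant within each permno/dte group), a single dcast call appears to be enough, because putting var2 and var3 on the left-hand side of the formula treats them as id variables instead of measure variables. The sketch below is not part of the original answer: it rebuilds the question's DF (dte as character is a simplification) and renames the 20 and 30 columns to match the requested output.

```r
library(reshape2)

# Rebuild the question's data; dte as character is a simplifying assumption
DF <- data.frame(
  permno = c(123L, 123L, 124L, 124L),
  dte    = "2012-01-01",
  ttm    = c(20L, 30L, 20L, 30L),
  var1   = c(1L, -1L, 2L, -2L),
  var2   = c(10L, 10L, 20L, 20L),
  var3   = c(100L, 100L, 200L, 200L),
  stringsAsFactors = FALSE
)

# var2 and var3 act as id variables here, so only var1 is spread over ttm
res <- dcast(DF, permno + dte + var2 + var3 ~ ttm, value.var = "var1")

# dcast names the new columns after the ttm values ("20", "30");
# prefix them with "var1_" and reorder to match the requested layout
ttm_cols <- as.character(sort(unique(DF$ttm)))
names(res)[match(ttm_cols, names(res))] <- paste0("var1_", ttm_cols)
res <- res[c("permno", "dte", "var1_20", "var1_30", "var2", "var3")]
res
```

If var2 or var3 did vary within a group, this would produce extra rows with NA in some of the var1 columns, so it is worth checking the row count against the number of distinct permno/dte pairs.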