Grouping key/value columns into a single row

Time: 2015-06-19 05:11:48

Tags: r dataframe data.table tidyr

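I'm trying to take key/value combinations and put all the values on the same row as their key. I'm pretty sure I knew how to do this at one point (with data.table, I think), and I've looked through the usual suspects (reshape2, tidyr, data.table, etc.), but I can't seem to come up with a simple solution.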

key1 = c(1,1,1,1,2,2,2,2)
key2 = c("A","A","B","B","C","C","D","D")
value = c("a","b","c","d","e","f","g","h")
kvframe = data.frame(key1,key2,value)

#  key1 key2 value
#1    1    A     a
#2    1    A     b
#3    1    B     c
#4    1    B     d
#5    2    C     e
#6    2    C     f
#7    2    D     g
#8    2    D     h

This is what I would like the table to look like:

# key1 key2 value1 value2
#    1    A      a      b
#    1    B      c      d
#    2    C      e      f
#    2    D      g      h

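Most key1, key2 pairs have the same number of corresponding values, but not all of them do. I'm hoping for a solution where the number of value columns equals the maximum number of values for any given key pair, and any pair with fewer values is padded with NA.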

1 Answer:

Answer 0: (score: 4)

You need a sequence column that numbers the values within each 'key1'/'key2' group.

library(data.table) # v1.9.5+
setDT(kvframe)[, Seq := paste0('value', 1:.N), by = .(key1, key2)] # generate Seq
dcast(kvframe, key1 + key2  ~Seq, value.var = 'value') # cast from long to wide

#   key1 key2 value1 value2
#1:    1    A      a      b
#2:    1    B      c      d
#3:    2    C      e      f
#4:    2    D      g      h

Or using reshape from base R:

 d1 <- transform(kvframe, Seq=ave(seq_along(value),
              key1, key2, FUN=seq_along))
 reshape(d1, idvar=c('key1', 'key2'), timevar='Seq', direction='wide')
 #  key1 key2 value.1 value.2
 #1    1    A       a       b
 #3    1    B       c       d
 #5    2    C       e       f
 #7    2    D       g       h

Or

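library(tidyr)
# d1 is the data frame with the Seq column built in the reshape example;
# the resulting value columns are named after the Seq values (1, 2)
spread(d1, Seq, value)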
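
The question also asks what happens when a key pair has fewer values than the others. The example data never exercises that case, so here is a minimal sketch using the base R approach on made-up input (the data frame kv2 and its contents are hypothetical, not from the question). It suggests that the shorter group is padded with NA, as requested.

```r
# Hypothetical input: the (1, "B") pair has only one value.
kv2 <- data.frame(
  key1  = c(1, 1, 1, 2, 2),
  key2  = c("A", "A", "B", "C", "C"),
  value = c("a", "b", "c", "e", "f"),
  stringsAsFactors = FALSE
)

# Number the values within each key1/key2 group: 1, 2, ...
kv2$Seq <- ave(seq_along(kv2$value), kv2$key1, kv2$key2, FUN = seq_along)

# Cast from long to wide; key pairs with no second value get NA in value.2.
wide <- reshape(kv2, idvar = c("key1", "key2"), timevar = "Seq",
                direction = "wide")
print(wide)
```

The number of value columns follows the largest group, because reshape creates one column per distinct Seq value.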